We present one-shot compression protocols that optimally encode ensembles of N identically prepared mixed states into O(log N) qubits. In contrast to the case of pure-state ensembles, we find that the number of encoding qubits drops down discontinuously as soon as a nonzero error is tolerated and the spectrum of the states is known with sufficient precision. For qubit ensembles, this feature leads to a 25% saving of memory space. Our compression protocols can be implemented efficiently on a quantum computer.


Storing data into the smallest possible space is of crucial importance in present-day digital technology, especially when dealing with large amounts of information and with limited memory space [1]. The need for saving space is even more pressing in the quantum domain, where storing data is an expensive task that requires sophisticated error correction techniques [2-4].

For quantum data, Schumacher's compression [5] and its extensions [6-10] provide optimal ways to store information in the asymptotic limit of many identical and independent uses of the same source. However, in many situations there may be correlations from one use of the source to the next. In such situations, it is convenient to regard N uses of the original source as a single use of a new source, which emits messages of length N. This scenario is an instance of one-shot quantum data compression [11]. An important example of one-shot compression is when the states emitted at N subsequent moments of time are perfectly correlated, resulting in codewords of the form $\rho_x^{\otimes N}$ for some density matrix $\rho_x$ and some random parameter x. This situation arises when the original source is an uncharacterized preparation device, which generates the same quantum state at every use. For quantum bits (qubits), Plesch and Bužek [12] observed that every ensemble of identically prepared pure states can be stored without any error into log(N + 1) qubits, thus allowing for an exponential saving of memory space. Recently, Rozema et al. [13] brought this idea into the realm of experiment, demonstrating a prototype of one-shot compression in a photonic setup.

The possibility of implementing one-shot compression in the lab opens new questions that require one to go beyond the ideal case of pure states and no errors. First, due to the presence of noise, real-life implementations typically involve mixed states (think, e.g., of quantum information processing with NMR [14], where the standard is to have thermal states at a given temperature, or, more generally, of mixed-state quantum computing [15-19]). For mixed states, the basic principle of pure-state compression does not work: in the qubit case, for example, projecting the quantum state into the smallest subspace containing the code words does not lead to any compression if the states $\rho^{\otimes N}$ are mixed, because in that case the smallest subspace is the whole Hilbert space.


As a result, it is natural to search for compression protocols that work for mixed states and to ask which protocols achieve the best compression performance. An even more important question is how the number of qubits needed to store data depends on the errors in the decoding. Tolerating a nonzero error is natural in real-life implementations, which typically suffer from noise and imperfections.

In this Letter we answer the above questions, proposing compression protocols for ensembles of identically prepared mixed states. We first analyze the zero-error scenario, showing that the storage of N mixed qubits with known purity and unknown Bloch vector requires a quantum memory of at least 2 log N qubits. The size of the required memory is twice that of the required memory for pure states, but it is still exponentially smaller than the initial data size. The maximum compression is achieved by a protocol that does not require knowledge of the purity. We then investigate the more realistic case of protocols with an error tolerance. When the purity is known with sufficient precision, we find that tolerating an error, no matter how small, allows one to encode the initial data into only 3/2 log N qubits, plus a small correction independent of N. Remarkably, the discontinuity in the error parameter takes place as soon as the prior knowledge of the purity is more precise than the knowledge that could be gained by measuring the N input qubits. The existence of a discontinuity is a striking deviation from the pure-state case, for which we prove that there is no significant advantage in introducing an error tolerance. Furthermore, we show that our compression protocol can be implemented efficiently and that the compression rate is optimal under the requirements that the encoding be rotationally covariant and the decoding preserve the magnitude of the total angular momentum. These assumptions are relevant in physical situations where the mixed states are used as indicators of spatial directions [20, 21] and the decoding operations are limited by conservation laws [22-27]. All our results can be generalized to quantum systems of arbitrary finite dimension, where we quantify how the presence of degeneracy in the spectrum affects the compression rates.

Let us start from the qubit case, assuming N to be even for the sake of concreteness. We denote by $\mathcal{E}: \mathcal{H}^{\otimes N} \to \mathcal{H}_{\rm enc}$ ($\mathcal{D}: \mathcal{H}_{\rm enc} \to \mathcal{H}^{\otimes N}$) the encoding (decoding) channel, where $\mathcal{H}$ is the Hilbert space of a single qubit and $\mathcal{H}_{\rm enc}$ is the Hilbert space of the encoding system. For an ensemble of identically prepared qubit states $\{\rho_x^{\otimes N}, p_x\}$ the average error of the compression protocol is

$$ e_N = \sum_x p_x \, \frac{1}{2} \left\| \rho_x^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_x^{\otimes N}\right) \right\| , \qquad (1) $$

$\|A\|$ denoting the trace norm. We consider ensembles where all the states $\rho_x$ have the same purity, which is assumed to be perfectly known (this assumption will be lifted later). Let us write $\rho_x$ as $\rho_{\mathbf n} = p\,|\mathbf n\rangle\langle \mathbf n| + (1-p)\,|{-\mathbf n}\rangle\langle {-\mathbf n}|$, where $|\mathbf n\rangle$ denotes the two-dimensional pure state with Bloch vector $\mathbf n = (n_x, n_y, n_z)$ and $p \ge 1/2$ is the maximum eigenvalue. We focus on mixed states ($p \neq 1$), excluding the trivial case p = 1/2, in which the ensemble consists of just one state. For $p \notin \{1, 1/2\}$, we call the ensemble $\{\rho_{\mathbf n}^{\otimes N}, p_{\mathbf n}\}$ complete if the probability distribution $p_{\mathbf n}$ is dense in the unit sphere. The typical example is an ensemble of mixed states with known purity and completely unknown Bloch vector. For every complete ensemble we demonstrate a sharp contrast between two types of compression: (i) zero-error compression, wherein the decoded state is equal to the initial state, and (ii) approximate compression, wherein small errors are tolerated. In the zero-error case we have the following


Theorem 1. The minimum number of logical qubits needed to compress a complete N-qubit ensemble is $\lceil 2\log(N+2) - 2 \rceil$. Every compression protocol that has zero error on a complete ensemble must have zero error on every ensemble of identically prepared mixed states and on every ensemble of permutationally invariant N-qubit states.


Intuitively, the reason for the exponential reduction of the number of qubits is that the states in the ensemble are invariant under permutations and, therefore, they do not carry all the information that could be encoded into N qubits. This observation was anticipated by Blume-Kohout et al. in the context of state discrimination and tomography [28]. The key point of Theorem 1 is the optimality proof, which establishes that if a mixed-state ensemble is complete, then compressing it is as hard as compressing any arbitrary ensemble of permutationally invariant states [29].

In preparation of our analysis of approximate compression, it is instructive to look into an optimal protocol achieving zero-error compression. The starting point is the Schur-Weyl duality [30], stating that there exists a basis in which the N-fold tensor action of the group GL(2) and the natural action of the permutation group $S_N$ are both block diagonal. In this basis, the Hilbert space of the N qubits can be decomposed as

$$ \mathcal{H}^{\otimes N} \simeq \bigoplus_{j=0}^{N/2} \left( \mathcal{R}_j \otimes \mathcal{M}_j \right) , \qquad (2) $$

where j is the quantum number of the total angular momentum, $\mathcal{R}_j$ is a representation space, in which the group GL(2) acts irreducibly, and $\mathcal{M}_j$ is a multiplicity space, in which the group acts trivially. Now, since the state $\rho_{\mathbf n}^{\otimes N}$ is invariant under permutations of the N qubits, one has

$$ \rho_{\mathbf n}^{\otimes N} = \bigoplus_{j=0}^{N/2} q_{j,N} \left( \rho_{\mathbf n, j} \otimes \frac{I_{m_j}}{m_j} \right) , \qquad (3) $$

where $q_{j,N}$ is a suitable probability distribution in j, $\rho_{\mathbf n,j}$ is a quantum state on $\mathcal{R}_j$, $I_{m_j}$ is the identity on $\mathcal{M}_j$, and $m_j$ is the dimension of $\mathcal{M}_j$. From Eq. (3) it is obvious that all information about the input state lies in the representation spaces. Hence, $\rho_{\mathbf n}^{\otimes N}$ can be encoded faithfully into the state $\mathcal{E}(\rho_{\mathbf n}^{\otimes N}) = \bigoplus_j q_{j,N}\, \rho_{\mathbf n,j}$. Such a state has an exponentially smaller support, contained in the space $\mathcal{H}_N := \bigoplus_{j=0}^{N/2} \mathcal{R}_j$, whose dimension is $\dim \mathcal{H}_N = (N/2+1)^2$. Hence, the initial state can be encoded into $\lceil \log \dim \mathcal{H}_N \rceil$ qubits, the amount declared in Theorem 1. A perfect decoding is achieved by the channel

$$ \mathcal{D}(\rho) = \bigoplus_j \left( P_j\, \rho\, P_j \otimes \frac{I_{m_j}}{m_j} \right) , \qquad (4) $$

where $P_j$ is the projector on the representation space $\mathcal{R}_j$.
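The dimension counting behind the zero-error protocol can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed form used for the multiplicity dimension $m_j$ is the standard Schur-Weyl expression for qubits, which is assumed here rather than taken from the text above.

```python
from math import comb

def multiplicity(N: int, j: int) -> int:
    """Dimension m_j of the multiplicity space M_j for N qubits (N even).

    Assumed standard formula: m_j = C(N, N/2 - j) - C(N, N/2 - j - 1).
    """
    k = N // 2 - j
    lower = comb(N, k - 1) if k >= 1 else 0
    return comb(N, k) - lower

def representation_dim_total(N: int) -> int:
    """Dimension of the direct sum of all representation spaces R_j."""
    return sum(2 * j + 1 for j in range(N // 2 + 1))

def zero_error_qubits(N: int) -> int:
    """Exact ceil(log2(dim)) using integer arithmetic only."""
    return (representation_dim_total(N) - 1).bit_length()

N = 20
# The blocks R_j (x) M_j must fill the full 2^N-dimensional space.
full_dim = sum((2 * j + 1) * multiplicity(N, j) for j in range(N // 2 + 1))
print(full_dim == 2 ** N)
print(representation_dim_total(N), zero_error_qubits(N))
```

For N = 20 the representation spaces span 121 dimensions, so 7 qubits suffice, in agreement with $\lceil 2\log(N+2) - 2\rceil$ for base-2 logarithms.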

Considering that qubits are a costly resource, it is worth pointing out a slight modification of the above protocol, which uses approximately log N qubits and log N classical bits. The modified protocol consists in (i) measuring the value of j, thus projecting the N qubits into the state $\rho_{\mathbf n,j} \otimes I_{m_j}/m_j$, (ii) discarding the multiplicity part, (iii) encoding the state $\rho_{\mathbf n,j}$ into $\lceil \log(N+1) \rceil$ qubits, and (iv) transmitting the encoded state to the receiver, along with a classical message specifying the value of j. Knowing the value of j, the receiver can append an additional system in the state $I_{m_j}/m_j$ and embed the state $\rho_{\mathbf n,j} \otimes I_{m_j}/m_j$ into the right subspace.

Let us consider now the more realistic case of approximate compression. Here, the number of encoding qubits drops down discontinuously.


Theorem 2. For every allowed error rate $\epsilon > 0$ and for every complete qubit ensemble, there exists a number $N_0 > 0$ such that for any $N > N_0$ the ensemble can be encoded into $3/2 \log N + \log\left[4(2p-1)\sqrt{\ln(2/\epsilon)}\right]$ qubits with error smaller than $\epsilon$.


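The idea is to work out the explicit form of the probability distribution in Eq. (3), given by

$$ q_{j,N} = \frac{2j+1}{2 j_0} \left[ B\left(N+1, p, \frac{N}{2}+j+1\right) - B\left(N+1, p, \frac{N}{2}-j\right) \right] , \qquad (5) $$

where B(n, p, k) is the binomial distribution with n trials and with probability p, and $j_0 = (p - 1/2)(N+1)$. For large N, the distribution $q_{j,N}$ is approximately the product of a linear function with the normal distribution of variance (N + 1)p(1 − p) centered around $j_0$. In order to compress, we get rid of the tails: for every $\epsilon > 0$, we select a set

$$ S_\epsilon := \left\{ j_0 - \left\lfloor \sqrt{\ln(2/\epsilon) N} \right\rfloor, \dots, j_0 + \left\lfloor \sqrt{\ln(2/\epsilon) N} \right\rfloor \right\} $$

and we compress the state $\rho_{\mathbf n}^{\otimes N}$ into the encoding space $\mathcal{H}_{\rm enc} = \bigoplus_{j \in S_\epsilon} \mathcal{R}_j$, by applying the quantum channel

$$ \mathcal{E}(\rho) := \bigoplus_{j \in S_\epsilon} \mathrm{Tr}_{\mathcal{M}_j}\left[ \Pi_j\, \rho\, \Pi_j \right] + \sum_{j \notin S_\epsilon} \mathrm{Tr}\left[ \Pi_j\, \rho \right] \rho_0 , $$

where $\Pi_j$ is the projector on $\mathcal{R}_j \otimes \mathcal{M}_j$, $\mathrm{Tr}_{\mathcal{M}_j}$ is the partial trace over $\mathcal{M}_j$, and $\rho_0$ is a fixed state with support inside $\mathcal{H}_{\rm enc}$. The encoding space has dimension

$$ \dim \mathcal{H}_{\rm enc} = \sum_{j \in S_\epsilon} (2j+1) \le (2 j_0 + 1) \left( 2\sqrt{\ln(2/\epsilon) N} + 1 \right) , $$

growing as $N^{3/2}$. The initial state can be recovered, up to error $\epsilon$, by a suitable decoding channel [29].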
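The truncation step can be illustrated numerically. The following sketch evaluates the distribution of Eq. (5) and the probability weight that falls outside the strip $S_\epsilon$. It makes two assumptions that are not in the text: the strip is taken as all integers j within $\sqrt{\ln(2/\epsilon)N}$ of $j_0$ (clipped to the range 0, ..., N/2), and all function names are hypothetical.

```python
from math import ceil, exp, floor, lgamma, log, sqrt

def binom_pmf(n: int, p: float, k: int) -> float:
    """Binomial probability B(n, p, k), computed in log space for stability."""
    if k < 0 or k > n:
        return 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)

def q_distribution(N: int, p: float) -> list:
    """Probabilities q_{j,N} of Eq. (5) for j = 0, ..., N/2 (N even, 1/2 < p < 1)."""
    if not 0.5 < p < 1.0:
        raise ValueError("this sketch requires 1/2 < p < 1")
    half = N // 2
    j0 = (p - 0.5) * (N + 1)
    return [(2 * j + 1) / (2 * j0)
            * (binom_pmf(N + 1, p, half + j + 1) - binom_pmf(N + 1, p, half - j))
            for j in range(half + 1)]

def strip(N: int, p: float, eps: float) -> range:
    """Assumed reading of S_eps: integers j with |j - j0| <= sqrt(ln(2/eps) N)."""
    j0 = (p - 0.5) * (N + 1)
    width = sqrt(log(2.0 / eps) * N)
    return range(max(0, ceil(j0 - width)), min(N // 2, floor(j0 + width)) + 1)

N, p, eps = 200, 0.6, 0.01
q = q_distribution(N, p)
S = strip(N, p, eps)
tail = sum(q[j] for j in range(N // 2 + 1) if j not in S)  # weight discarded
enc_dim = sum(2 * j + 1 for j in S)                        # dim of H_enc
print(sum(q), tail, enc_dim, (N // 2 + 1) ** 2)
```

The discarded weight `tail` is the quantity that controls the compression error, while `enc_dim` can be compared with the zero-error dimension $(N/2+1)^2$.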

Theorem 2 guarantees that N identical copies of a mixed state with known purity can be stored faithfully up to an error $\epsilon$ into 3/2 log N qubits, plus an overhead that is doubly logarithmic in $1/\epsilon$. This result is good news for future implementations, because the overhead grows slowly with the required accuracy. For example, when p = 0.6, N = 20 identically prepared qubits with Bloch vectors pointing in arbitrary directions can be compressed into 8 qubits with an error smaller than 1%. In addition to the fully quantum version of the protocol, one can construct a hybrid version where the initial state is stored partly into qubits and partly into classical bits, as discussed in the zero-error case. In the hybrid version, the discontinuity between zero-error and approximate compression pertains to the number of classical bits needed to communicate the value of j, which decreases from log N to 1/2 log N as soon as a nonzero error is tolerated.
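The numerical example above follows directly from the expression in Theorem 2. The short sketch below evaluates that expression with base-2 logarithms; the function name is hypothetical, and the finite-N validity condition $N > N_0$ of the theorem is not checked.

```python
from math import ceil, log, log2, sqrt

def theorem2_qubits(N: int, p: float, eps: float) -> float:
    """Evaluate 3/2 log N + log[4 (2p - 1) sqrt(ln(2/eps))] with base-2 logs."""
    return 1.5 * log2(N) + log2(4.0 * (2.0 * p - 1.0) * sqrt(log(2.0 / eps)))

value = theorem2_qubits(20, 0.6, 0.01)
print(value, ceil(value))

# The overhead grows slowly with the accuracy: compare eps = 1e-2 and eps = 1e-6.
extra = theorem2_qubits(20, 0.6, 1e-6) - theorem2_qubits(20, 0.6, 1e-2)
print(extra)
```

Rounding the value up gives the 8 qubits quoted in the text, and tightening the error from 1% to one part in a million costs less than one additional qubit.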

Our result highlights a radical difference between mixed and pure states: for mixed states, every finite error tolerance $\epsilon > 0$ allows one to reduce the size of the compression space from the original 2 log N qubits to 3/2 log N qubits. Such a discontinuity does not take place for pure states: for pure states with completely unknown Bloch vector, every compression protocol with tolerance $\epsilon$ requires at least $(1 - 2\epsilon) \log N$ qubits [29].

It is worth commenting on the importance of knowing the purity. Our approximate protocol requires the purity to be perfectly known, so that one can encode only the subspaces where the quantum number j is in a strip around the most likely value. If the purity is only partially known, the protocol can be adapted by broadening the size of the strip, i.e., by changing the set $S_\epsilon$. Specifically, suppose that the eigenvalues of $\rho_{\mathbf n}$ are known up to an error $\Delta p = O(N^{-\gamma})$, with $\gamma > 1/2$. In this case, the number of encoding qubits can be reduced to $3/2 \log N + g(\epsilon, \gamma)$, where g is a function depending on $\epsilon$ and $\gamma$, but not on N. Hence, the discontinuity between zero-error and approximate compression persists. However, the situation is different if the eigenvalues are known with less precision: if the error in the specification of the eigenvalues scales as $N^{-\gamma}$ with $\gamma < 1/2$, then the number of encoding qubits becomes $(2 - \gamma) \log N$. Quite intriguingly, the separation between the two regimes takes place exactly when the knowledge of the eigenvalues becomes more precise than the knowledge that could be extracted through spectrum estimation [31]. Note that our protocol can be combined for free with spectrum estimation, which only requires measuring the value of j. However, the a posteriori knowledge of the measurement outcome cannot replace the a priori knowledge of the spectrum: indeed, finding the outcome j leads to estimating the maximum eigenvalue as $\hat p = 1/2 + j/(N+1)$ [31] and then to encoding the state $\rho_{\mathbf n,j}$ into $\lceil \log(2j+1) \rceil$ qubits. In order to decode, the receiver needs a classical message communicating the value of j, which requires $\lceil \log(N/2+1) \rceil$ bits in the one-shot scenario. This leads to the same resource scaling as in the zero-error case, i.e., approximately log N qubits to send the encoded state and log N bits to communicate j.

The protocol of Theorem 2 is optimal within the physically relevant class of protocols constrained by covariance under rotations and by the preservation of the magnitude of the angular momentum. More precisely, we have the following [29].

Theorem 3. Every compression protocol that encodes a complete N-qubit ensemble into $(3/2 - \delta) \log N$ qubits with covariant encoding and a decoding that preserves the magnitude of the total angular momentum will necessarily have error $\epsilon > 1/2$ in the asymptotic limit.

Let us now discuss the complexity of the compression protocol. To operate on the input state we use the Schur transform [12, 32, 33], which transforms the initial N qubits together with O(log N) ancillary qubits into three registers: (i) the index register, where the value of j is stored into the state of log(N/2 + 1) qubits, (ii) the representation register, which uses log(N + 1) qubits to encode the representation spaces, and (iii) the multiplicity register, where the multiplicity spaces are encoded into O(N) qubits (see Fig. 1). Since the implementation of the Schur transform in a quantum circuit is approximate, we focus on approximate compression, so that the Schur transform error can be absorbed into the compression error. Let us analyze first the encoding. The first step is the approximate Schur transform, whose complexity is $\mathrm{poly}(N, \log 1/\epsilon')$, $\epsilon'$ being the approximation error [32, 33]. We set $\epsilon'$ to be vanishing exponentially in N, resulting in a complexity poly(N) for the implementation of the Schur transform. After the Schur transform has been performed, the encoding circuit embeds the index register into an exponentially larger register of N/2 + 1 qubits, transforming the state $|j\rangle$ into the state where the jth qubit is set to $|1\rangle$ and the rest of the qubits are set to $|0\rangle$ [12]. We refer to this transformation as position embedding and denote it by $V_D$, where D is the dimension of the register that is being embedded (in this case D = N/2 + 1). The point of position embedding is to physically encode the value of j in a form that makes it easy to check whether or not j belongs to the set $S_\epsilon$. In fact, such a check can be equivalently implemented on a classical computer. After this step, the circuit discards the qubits in positions outside the set $S_\epsilon$ and transforms the remaining qubits into $\log |S_\epsilon|$ qubits, by applying the inverse position embedding $V^{-1}_{|S_\epsilon|}$. Now, the complexity of position embedding is upper bounded by $D(\log D)^2$ [12]. Since j ranges from 0 to N/2, the total complexity of the position embedding and of its inverse scales as $N(\log N)^2$. From the above reasoning, it is clear that the bottleneck of the encoding is the implementation of the Schur transform, which leads to an overall complexity of poly(N) for the encoding circuit. The situation is similar for the decoding, which also uses position embedding to perform operations depending on j (see Fig. 2). The only new parts are the initialization of $N/2 + 1 - |S_\epsilon|$ qubits in the index register and the preparation of maximally mixed states of rank $m_j$ in the multiplicity register, which can be approximately generated with exponential precision in $O(N^2)$ operations [29]. Summing over the values of j in $S_\epsilon$, we then obtain a number of operations upper bounded by $O(N^2)|S_\epsilon| = O(N^{5/2})$. From the above count it is clear that the overall complexity is polynomial in N. In addition to the computational complexity, it is worth discussing the size of the ancillary systems needed in our compression protocol. Since the multiplicity register is discarded, the Schur transform in our protocol needs only an ancilla of O(log N) qubits [28]. The position embeddings require ancillas of size O(N), but, as mentioned earlier, they can be implemented on a classical computer. Hence, the total number of qubits that need to be kept coherent throughout our protocol scales only as O(log N).

FIG. 1. A quantum circuit for encoding. The Schur transform turns the initial N qubits together with K = O(log N) ancillary qubits into three registers: the index register $\mathcal{J}$, the representation register $\mathcal{R}$, and the multiplicity register $\mathcal{M}$. The multiplicity register is discarded. The index register is encoded into N/2 + 1 qubits by the position embedding $V_{N/2+1}$. The qubits in positions outside $S_\epsilon$ are discarded and the remaining qubits are reencoded into $\lceil \log |S_\epsilon| \rceil$ qubits.

FIG. 2. A quantum circuit for decoding. The first operation is the position embedding $V_{|S_\epsilon|}$, which produces $|S_\epsilon|$ output qubits. The jth of these qubits controls the generation of a maximally mixed state of rank $m_j$ (achieved by the controlled operation $G_j$, represented explicitly in the blue inset for $m_j = 4$). The third step is the initialization of $L = N/2 + 1 - |S_\epsilon|$ qubits, which are put in positions corresponding to values of j outside $S_\epsilon$. After a total of N/2 + 1 qubits are in place, the inverse of the position embedding is performed, followed by the inverse of the Schur transform. The output of the circuit is a state on N qubits and K = O(log N) ancillas, which are finally discarded.
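Since the text notes that the membership check for $S_\epsilon$ can be carried out on a classical computer, the handling of the index register can be mimicked with classical bit strings. The sketch below is only a classical stand-in for the position embedding $V_D$ and its inverse, not a quantum circuit; all function names are hypothetical.

```python
def position_embed(j: int, D: int) -> list:
    """One-hot encoding of index j in a register of D positions (stand-in for V_D)."""
    if not 0 <= j < D:
        raise ValueError("index out of range")
    return [1 if k == j else 0 for k in range(D)]

def position_unembed(bits: list) -> int:
    """Inverse of position_embed: recover the position of the single 1."""
    if sum(bits) != 1:
        raise ValueError("not a valid one-hot string")
    return bits.index(1)

def encode_index(j: int, N: int, S: list):
    """Keep only positions in S and re-encode into ceil(log2 |S|) bits.

    Returns None when j lies outside S (where the protocol outputs rho_0).
    """
    onehot = position_embed(j, N // 2 + 1)
    kept = [onehot[k] for k in S]            # discard positions outside S
    if sum(kept) == 0:
        return None
    width = max(1, (len(S) - 1).bit_length())
    return format(position_unembed(kept), f"0{width}b")

def decode_index(word: str, S: list) -> int:
    """Map the compressed word back to the original value of j."""
    return S[int(word, 2)]

S = list(range(3, 8))                        # an example strip with |S| = 5
word = encode_index(5, 20, S)
print(word, decode_index(word, S), encode_index(9, 20, S))
```

The one-hot form makes the test "is j in $S_\epsilon$" a matter of selecting positions, at the price of a register of size O(N), which is why the text delegates this step to classical processing.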

Our compression protocol, presented for qubits, can be generalized to quantum systems of arbitrary dimension d. In this case, an ensemble of N identically prepared rank-r states with known spectrum can be compressed with error less than $\epsilon$ into approximately $(2dr - r^2 - 1)/2 \, \log N$ qubits. In addition, one can take advantage of the presence of degeneracies and further reduce the number of qubits: every time the same eigenvalue appears in the spectrum the number of qubits is reduced by at least 1/2 log N (see [29] for the exact value). Again, the protocol can be implemented efficiently and is optimal under suitable symmetry assumptions [29].
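As a quick consistency check, the coefficient of log N in the general formula can be evaluated for the qubit cases discussed earlier. The helper below is a hypothetical illustration that ignores the additional savings from degenerate eigenvalues.

```python
from fractions import Fraction

def log_n_coefficient(d: int, r: int) -> Fraction:
    """Coefficient (2dr - r^2 - 1)/2 multiplying log N for rank-r states in dimension d."""
    if not 1 <= r <= d:
        raise ValueError("rank must satisfy 1 <= r <= d")
    return Fraction(2 * d * r - r * r - 1, 2)

print(log_n_coefficient(2, 2))  # full-rank qubit states
print(log_n_coefficient(2, 1))  # pure qubit states
print(log_n_coefficient(3, 3))  # full-rank qutrit states
```

For d = 2 the formula reproduces the 3/2 log N scaling of Theorem 2 for mixed states and the log N scaling of pure-state compression.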

In this Letter we showed how to efficiently store ensembles of identically prepared quantum systems into an exponentially smaller memory space. For mixed states we discovered that, whenever a nonzero error is allowed, the size of the memory is cut down in a discontinuous way, provided that the spectrum of the state is known with sufficient precision. Intriguingly, the dropoff in the memory size takes place as soon as the prior information about the eigenvalues is more than the information that could be extracted by a measurement on the input copies. Our approximate compression protocols can be implemented efficiently on a quantum computer.
PROOF OF THEOREM 1


Here we show the optimality of the zero-error protocol in the main text. Specifically, we show that no zero-error protocol exists that compresses a complete ensemble of mixed states into less than $\lceil 2\log(N+2) - 2 \rceil$ qubits.



The zero error condition 


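The condition for zero-error compression requires that the average error vanish, namely

$$ e_N = \sum_{\mathbf n} p_{\mathbf n} \, \frac{\left\| \rho_{\mathbf n}^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_{\mathbf n}^{\otimes N}\right) \right\|}{2} = 0 . \qquad (7) $$

This condition immediately implies $\| \mathcal{D}\circ\mathcal{E}(\rho_{\mathbf n}^{\otimes N}) - \rho_{\mathbf n}^{\otimes N} \| = 0$ for every n except for a zero-measure set. Since the Hermitian operator $\mathcal{D}\circ\mathcal{E}(\rho_{\mathbf n}^{\otimes N}) - \rho_{\mathbf n}^{\otimes N}$ has only zero eigenvalues, it must be a null operator. Hence, the channel $\mathcal{C} := \mathcal{D}\circ\mathcal{E}$ must fix $\rho_{\mathbf n}^{\otimes N}$, namely

$$ \mathcal{C}\left(\rho_{\mathbf n}^{\otimes N}\right) = \rho_{\mathbf n}^{\otimes N} \qquad (8) $$

for every n except for a set of zero measure. Since $p_{\mathbf n}$ has full support on the Bloch sphere, the above condition holds for a dense set of points on the Bloch sphere. As a result, for every Bloch vector n there exists a sequence $\{\mathbf n_k\}$ of Bloch vectors satisfying Eq. (8) such that $\lim_{k\to\infty} \mathbf n_k = \mathbf n$ and

$$ \lim_{k\to\infty} \rho_{\mathbf n_k}^{\otimes N} = \rho_{\mathbf n}^{\otimes N} . $$

Consequently, we have

$$ \left\| \mathcal{D}\circ\mathcal{E}\left(\rho_{\mathbf n}^{\otimes N}\right) - \rho_{\mathbf n}^{\otimes N} \right\| = \left\| \mathcal{D}\circ\mathcal{E}\left( \lim_{k\to\infty} \rho_{\mathbf n_k}^{\otimes N} \right) - \lim_{k\to\infty} \rho_{\mathbf n_k}^{\otimes N} \right\| = \lim_{k\to\infty} \left\| \mathcal{D}\circ\mathcal{E}\left(\rho_{\mathbf n_k}^{\otimes N}\right) - \rho_{\mathbf n_k}^{\otimes N} \right\| = 0 , $$

which implies that $\mathcal{C}(\rho_{\mathbf n}^{\otimes N}) = \rho_{\mathbf n}^{\otimes N}$ for every vector n on the Bloch sphere.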


The algebra associated to the fixed points of a channel 

Here we develop a technique that generates fixed points of a given channel starting from an initial set of fixed points. Our technique is based on a result by Blume-Kohout et al. [34] that characterizes the fixed points. Specifically, Theorem 5 of Ref. [34] guarantees that one can find a decomposition of the Hilbert space as $\mathcal{H} = \bigoplus_k (\mathcal{L}_k \otimes \mathcal{M}_k)$, with the property that the fixed points of a given channel acting on $\mathcal{H}$ are all the operators of the form

$$ A = \bigoplus_k \left( A_k \otimes \omega_k \right) , \qquad (9) $$

where $A_k$ is an arbitrary matrix on $\mathcal{L}_k$ and $\omega_k$ is a fixed non-negative matrix on $\mathcal{M}_k$. Using this fact, we develop a technique that generates fixed points of a channel starting from an initial set of fixed points.

Proposition 1. Let $\mathrm{Fix}(\mathcal{C})$ be the set of fixed points of channel $\mathcal{C}$, let $\{A_x\}_{x \in X} \subset \mathrm{Fix}(\mathcal{C})$ be a subset of non-negative fixed points, and let p(dx) be a non-negative measure on X. Then, the set of operators

$$ \mathcal{A} := E^{-1/2}\, \mathrm{Fix}(\mathcal{C})\, E^{-1/2} , \qquad E := \int p(dx)\, A_x , $$

is a matrix *-algebra (i.e., a matrix algebra closed under adjoint). Moreover, one has $E^{1/2} \mathcal{A} E^{1/2} \subseteq \mathrm{Fix}(\mathcal{C})$.

[Notation: for a non-invertible operator E, we define $E^{-1}$ as the inverse on the support of E.]

Proof. Writing each operator $A_x$ in the form (9), we obtain

$$ E = \bigoplus_k \left( E^{(k)} \otimes \omega_k \right) , \qquad E^{(k)} = \int p(dx)\, A_x^{(k)} . $$

Hence, for a generic fixed point $A \in \mathrm{Fix}(\mathcal{C})$, decomposed as in Eq. (9), we have

$$ E^{-1/2} A E^{-1/2} = \bigoplus_k \left[ \left(E^{(k)}\right)^{-1/2} A_k \left(E^{(k)}\right)^{-1/2} \otimes P_k \right] , $$

where $P_k$ is the projector on the support of $\omega_k$. Since each $A_k$ is a generic operator on $\mathcal{L}_k$, we have

$$ E^{-1/2}\, \mathrm{Fix}(\mathcal{C})\, E^{-1/2} = \bigoplus_k \left[ B(\mathcal{S}_k) \otimes P_k \right] , $$

where $B(\mathcal{S}_k)$ denotes the algebra of all linear operators on the subspace $\mathcal{S}_k = \mathrm{Supp}\left(E^{(k)}\right)$. Hence, $\mathcal{A} = E^{-1/2}\, \mathrm{Fix}(\mathcal{C})\, E^{-1/2}$ is an algebra and is closed under adjoint. On the other hand, we have

$$ E^{1/2} \mathcal{A} E^{1/2} = \bigoplus_k \left[ B(\mathcal{S}_k) \otimes \omega_k \right] , $$

meaning that every operator in $E^{1/2} \mathcal{A} E^{1/2}$ is of the form (9), that is, it is a fixed point. □


The minimal algebra required by the zero error condition 

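Let us apply Proposition 1 to the channel $\mathcal{C} = \mathcal{D}\circ\mathcal{E}$, resulting from the concatenation of the encoding and the decoding in a generic zero-error protocol. By the zero-error condition, all the states $\rho_{\mathbf n}^{\otimes N}$ are fixed points. The states can be decomposed as

$$ \rho_{\mathbf n}^{\otimes N} = \bigoplus_{j=0}^{N/2} q_{j,N} \left( \rho_{\mathbf n,j} \otimes \frac{I_{m_j}}{m_j} \right) . \qquad (10) $$

A priori, this block decomposition could be completely unrelated with the block decomposition of Eq. (9). Proving that the two decompositions coincide will be the main part of our argument.

Choosing the measure p(dx) in Proposition 1 to be the invariant measure over n, the average operator E is given by

$$ E = \bigoplus_{j=0}^{N/2} q_{j,N} \left( \frac{I_{d_j}}{d_j} \otimes \frac{I_{m_j}}{m_j} \right) , $$

where $d_j$ is the dimension of the representation space $\mathcal{R}_j$. Hence, the algebra $\mathcal{A}$ defined in Proposition 1 must contain all the operators of the form

$$ E^{-1/2} \rho_{\mathbf n}^{\otimes N} E^{-1/2} = \bigoplus_{j=0}^{N/2} \left( d_j\, \rho_{\mathbf n,j} \otimes I_{m_j} \right) , $$

for every unit vector n. Hence, $\mathcal{A}$ must contain the smallest algebra $\mathcal{A}_{\min}$ generated by the above operators. We will now characterize this algebra: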


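Proposition 2. If the states in Eq. (10) are not maximally mixed, $\mathcal{A}_{\min}$ contains the matrix algebra of all operators on the symmetric subspace, corresponding to j = N/2 in the decomposition (10).

Proof. Let us express the state $\rho = p\,|0\rangle\langle 0| + (1-p)\,|1\rangle\langle 1|$ as $\rho = e^{-\beta Z}/\mathrm{Tr}[e^{-\beta Z}]$, $Z = |0\rangle\langle 0| - |1\rangle\langle 1|$, for a suitable $\beta > 0$. By definition, for every unitary $U \in SU(2)$, the algebra $\mathcal{A}_{\min}$ contains the operator

$$ A_U := E^{-1/2} \left( U \rho\, U^\dagger \right)^{\otimes N} E^{-1/2} = \bigoplus_{j=0}^{N/2} d_j\, \frac{U^{(j)} e^{-\beta J_z^{(j)}} U^{(j)\dagger}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(j)}} \right]} \otimes I_{m_j} , \qquad J_z^{(j)} = \sum_{m=-j}^{j} m\, |j,m\rangle\langle j,m| , \qquad (11) $$

where $U^{(j)}$ denotes the (2j + 1)-dimensional irreducible representation of SU(2). Moreover, since the algebra $\mathcal{A}_{\min}$ is closed under linear combinations, $\mathcal{A}_{\min}$ must contain the operator

$$ X_l = \int dU\, \chi_U^{(l)} A_U , $$

where $\chi_U^{(l)}$ are the characters of the irreducible representations of SU(2), given by $\chi_U^{(l)} = \mathrm{Tr}[U^{(l)}]$. Let us set l = N. In this case, the orthogonality of SU(2) matrix elements eliminates all terms in the block decomposition of $\rho^{\otimes N}$, except for the term with j = N/2. Notice that in this case the multiplicity subspace is trivial. Hence, one has

$$ X_N = \int dU\, \chi_U^{(N)}\, d_{N/2}\, U^{(N/2)} \rho_{N/2}\, U^{(N/2)\dagger} , \qquad \rho_{N/2} = \frac{e^{-\beta J_z^{(N/2)}}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(N/2)}} \right]} . $$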


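The matrix elements of $X_N$ can be computed explicitly as

$$
\begin{aligned}
\left\langle \tfrac{N}{2}, n \right| X_N \left| \tfrac{N}{2}, n' \right\rangle
&= \frac{d_{N/2}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(N/2)}} \right]} \int dU\, \chi_U^{(N)} \sum_{m=-N/2}^{N/2} e^{-\beta m} \left\langle \tfrac{N}{2}, n \right| U^{(N/2)} \left| \tfrac{N}{2}, m \right\rangle \left\langle \tfrac{N}{2}, m \right| U^{(N/2)\dagger} \left| \tfrac{N}{2}, n' \right\rangle \\
&= \delta_{n,n'}\, (-1)^{n}\, \frac{d_{N/2} \left\langle \tfrac{N}{2}, n; \tfrac{N}{2}, -n' \middle| N, 0 \right\rangle}{d_N\, \mathrm{Tr}\left[ e^{-\beta J_z^{(N/2)}} \right]} \sum_{m=-N/2}^{N/2} \left(-e^{-\beta}\right)^{m} \left\langle \tfrac{N}{2}, m; \tfrac{N}{2}, -m \middle| N, 0 \right\rangle \\
&= \delta_{n,n'}\, (-1)^{n}\, \frac{d_{N/2} \left\langle \tfrac{N}{2}, n; \tfrac{N}{2}, -n' \middle| N, 0 \right\rangle}{d_N\, \mathrm{Tr}\left[ e^{-\beta J_z^{(N/2)}} \right]} \sum_{m=-N/2}^{N/2} \frac{(N!)^2 \left(-e^{-\beta}\right)^{m}}{(N/2-m)!\,(N/2+m)!\,\sqrt{(2N)!}} \\
&= \delta_{n,n'}\, (-1)^{n+N/2}\, \frac{d_{N/2}\, (N!)\, e^{\beta N/2} \left(1 - e^{-\beta}\right)^{N} \left\langle \tfrac{N}{2}, n; \tfrac{N}{2}, -n' \middle| N, 0 \right\rangle}{d_N\, \sqrt{(2N)!}\; \mathrm{Tr}\left[ e^{-\beta J_z^{(N/2)}} \right]} ,
\end{aligned}
$$

$\langle j_1, m_1; j_2, m_2 | J, M \rangle$ denoting the Clebsch-Gordan coefficient. Note that the Clebsch-Gordan coefficient in the above expression is nonzero if and only if n = n'. As a consequence, the operator $X_N$ has full support.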

Now, since $\mathcal{A}_{\min}$ is an algebra, it must contain $X_N$ as well as the whole Abelian algebra generated by it. In particular, it must contain the projector on the support of $X_N$, which is nothing but $P_{N/2}$, the projector on the symmetric subspace. Moreover, it must contain all the operators of the form

$$ A_{U,N/2} = P_{N/2}\, A_U\, P_{N/2} \propto U^{(N/2)} e^{-\beta J_z^{(N/2)}} U^{(N/2)\dagger} \qquad \forall\, U \in SU(2) . $$

Finally, for $\beta \neq 0$, it is easy to see that the smallest algebra $\mathcal{A}_{\min,N/2}$ containing the above operators is the algebra $B(\mathcal{R}_{N/2})$. This can be easily seen by von Neumann's double commutant theorem: if an operator B commutes with the non-degenerate Hermitian operator $A_{U,N/2}$ for every U, then B must be proportional to the identity. Hence, the double commutant of $\mathcal{A}_{\min,N/2}$, equal to $\mathcal{A}_{\min,N/2}$ itself, is the whole $B(\mathcal{R}_{N/2})$. In conclusion, we have the inclusion $B(\mathcal{R}_{N/2}) \subseteq \mathcal{A}_{\min,N/2} \subseteq \mathcal{A}_{\min}$. □

Proposition 3. If the states in Eq. (10) are neither pure nor maximally mixed, then $\mathcal{A}_{\min}$ is the full algebra generated by the N-fold tensor representation of GL(2), namely

$$ \mathcal{A}_{\min} = \bigoplus_{j=0}^{N/2} \left[ B(\mathcal{R}_j) \otimes I_{m_j} \right] , $$

$B(\mathcal{R}_j)$ denoting the algebra of all linear operators on the representation space $\mathcal{R}_j$.

Proof. We prove that $\mathcal{A}_{\min}$ contains the algebra $B(\mathcal{R}_j) \otimes I_{m_j}$ for every j. The proof is by induction, with j starting from N/2 and going down to 0. For j = N/2 we know that $\mathcal{A}_{\min}$ contains the algebra $B(\mathcal{R}_{N/2})$ of all operators with support in the symmetric subspace. Let us assume that $\mathcal{A}_{\min}$ contains all the algebras $B(\mathcal{R}_j) \otimes I_{m_j}$ with $j \ge j_* + 1$ and show that it must necessarily contain also the algebra $B(\mathcal{R}_{j_*}) \otimes I_{m_{j_*}}$. By construction, we know that $\mathcal{A}_{\min}$ contains all the operators $A_U$ of the form

$$ A_U = \bigoplus_{j=0}^{N/2} d_j\, \frac{U^{(j)} e^{-\beta J_z^{(j)}} U^{(j)\dagger}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(j)}} \right]} \otimes I_{m_j} , \qquad J_z^{(j)} = \sum_{m=-j}^{j} m\, |j,m\rangle\langle j,m| . $$

Since the states in Eq. (10) are not pure, all the blocks in the sum are non-zero. Moreover, the induction hypothesis implies that $\mathcal{A}_{\min}$ should also contain the operators $A'_U$ of the form

$$ A'_U = \bigoplus_{j=0}^{j_*} d_j\, \frac{U^{(j)} e^{-\beta J_z^{(j)}} U^{(j)\dagger}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(j)}} \right]} \otimes I_{m_j} , \qquad U \in SU(2) . $$

Now, we can repeat the argument used in the proof of Proposition 2: by linearity, $\mathcal{A}_{\min}$ must contain the operator

$$ X_{2j_*} = \int dU\, \chi_U^{(2j_*)} A'_U = \int dU\, \chi_U^{(2j_*)}\, d_{j_*}\, \frac{U^{(j_*)} e^{-\beta J_z^{(j_*)}} U^{(j_*)\dagger}}{\mathrm{Tr}\left[ e^{-\beta J_z^{(j_*)}} \right]} \otimes I_{m_{j_*}} . $$

Explicit calculation (same as in Proposition 2) shows that $X_{2j_*}$ has full rank. Hence, the projector on the support of $X_{2j_*}$ is $P_{j_*} = I_{\mathcal{R}_{j_*}} \otimes I_{m_{j_*}}$. Since $\mathcal{A}_{\min}$ should contain this projector, it must also contain all operators of the form

$$ A'_{U,j_*} = P_{j_*} A'_U P_{j_*} \propto U^{(j_*)} e^{-\beta J_z^{(j_*)}} U^{(j_*)\dagger} \otimes I_{m_{j_*}} , \qquad U \in SU(2) . $$

Again, using von Neumann's double commutant theorem, it is easy to show that the smallest algebra containing all the above operators is $B(\mathcal{R}_{j_*}) \otimes I_{m_{j_*}}$. In conclusion we proved that $\mathcal{A}_{\min}$ must contain $B(\mathcal{R}_{j_*}) \otimes I_{m_{j_*}}$. By induction, this proves the inclusion

$$ \mathcal{A}_{\min} \supseteq \bigoplus_{j=0}^{N/2} \left[ B(\mathcal{R}_j) \otimes I_{m_j} \right] . $$

On the other hand, the definition of $\mathcal{A}_{\min}$ implies the opposite inclusion. Hence, one must have the equality. □


Zero-error compression of a complete ensemble implies zero-error compression for every ensemble of permutationally invariant states


Propositions 1 and 3 imply the following 

Corollary 1. If the states (10) are neither pure nor maximally mixed, every channel C preserving them must preserve 
all permutationally invariant states. 

Proof. By Propositions 1 and 3, the channel $\mathcal{C}$ must satisfy

$$ \mathrm{Fix}(\mathcal{C}) \supseteq \mathcal{A}_{\min} = \bigoplus_{j=0}^{N/2} \left[ B(\mathcal{R}_j) \otimes I_{m_j} \right] , $$

meaning that the full algebra generated by the tensor representation of GL(2) is contained in the set of fixed points. □

We are now in position to prove Theorem 1 in the main text:

Proof of Theorem 1. Suppose that a compression protocol has zero error on a complete ensemble of mixed 
states. Then, Corollary 1 implies that the protocol should have zero error on all permutationally invariant states. In 
particular, the protocol should be able to transmit without error the following ensemble of orthogonal pure states 


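$$\mathsf{S} := \left\{ \rho_{j,m} = |j,m\rangle\langle j,m| \otimes \frac{I_{m_j}}{m_j},\ p_{j,m} = \frac{1}{D} \ \Big|\ j = 0,\dots,N/2,\ m = -j,\dots,j \right\}, \qquad D := \sum_{j} d_j .$$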


A lower bound on the dimension $d_{\rm enc}$ of the encoding space $\mathcal{H}_{\rm enc}$ is then obtained by considering the amount of classical information carried by $\mathsf{S}$. In detail, the lower bound can be calculated using the monotonicity of Holevo's chi quantity under quantum data processing. Holevo's chi quantity of $\mathsf{S}$ [35] is defined as follows

$$\chi(\mathsf{S}) := H\Big( \sum_{j,m} p_{j,m}\, \rho_{j,m} \Big) - \sum_{j,m} p_{j,m}\, H(\rho_{j,m}),$$

with $H(\rho)$ being the von Neumann entropy of the state $\rho$. Since the chi quantity is non-increasing under quantum evolutions, in the zero-error scenario we have

$$\chi(\mathsf{S}) = \chi(\mathsf{S}_{\rm enc}) \tag{12}$$

where $\mathsf{S}_{\rm enc}$ is the encoded ensemble $\mathsf{S}_{\rm enc} := \{\mathcal{E}(\rho_{j,m}), p_{j,m}\}$. On the other hand, the dimension of the encoding subspace is lower bounded by the chi quantity [8]

$$\log d_{\rm enc} \ge \chi(\mathsf{S}_{\rm enc}). \tag{13}$$

The chi quantity for the ensemble $\mathsf{S}$ can be computed as $\chi(\mathsf{S}) = \log D$. Combining this equality with Eqs. (12) and (13) we get

$$\log d_{\rm enc} \ge \log D,$$

which concludes the optimality proof. The protocol shown in the main text saturates the bound. □
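As a quick numerical illustration of the bound $\log d_{\rm enc} \ge \log D$, the following minimal sketch evaluates $D = \sum_j (2j+1)$ for $N$ qubits. The function names are hypothetical, and the handling of odd $N$ (where $j$ is half-integer) is an assumption that goes beyond the even-$N$ range $j = 0,\dots,N/2$ used in the text.

```python
from math import log2


def zero_error_dimension(N: int) -> int:
    """D = sum_j (2j+1) over the total spins j occurring for N qubits.

    We iterate over the integer 2j to avoid half-integers: 2j runs over
    N, N-2, ..., down to 0 (N even) or 1 (N odd, an assumption here).
    """
    return sum(two_j + 1 for two_j in range(N % 2, N + 1, 2))


def zero_error_qubits(N: int) -> float:
    """Lower bound log2(D) on the number of qubits of a zero-error protocol."""
    return log2(zero_error_dimension(N))


# For even N the sum has the closed form (N/2 + 1)^2.
example_dimension = zero_error_dimension(100)
example_qubits = zero_error_qubits(100)
```

For even $N$ the sum equals $(N/2+1)^2$, so the zero-error cost grows like $2\log N$.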


PROOF OF THEOREM 2 


As stated in the main text, we assume p > 1/2, because for p = 1/2 the ensemble is trivial, consisting only of the
maximally mixed state. 

We first notice that the error of the compression protocol is upper bounded as 

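$$
\begin{aligned}
e_N &= \frac12 \left\| \rho_{\mathbf n}^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_{\mathbf n}^{\otimes N}\right) \right\|_1 \qquad \forall\, \mathbf n \in \mathbb{S}^2 \\
&= \frac12 \left\| \sum_{j \notin \mathsf{S}_\epsilon} q_{j,N} \left[ \rho_{\mathbf n, j} \otimes \frac{I_{m_j}}{m_j} - \mathcal{D}(\rho_0) \right] \right\|_1 \\
&\le \sum_{j \notin \mathsf{S}_\epsilon} q_{j,N},
\end{aligned}
\tag{14}
$$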


the last step following from the triangle inequality and from the fact that the trace distance of two states is upper 
bounded by 2. Note that the upper bound is independent of n, meaning that the protocol works equally well for all 
states with the same spectrum (or equivalently, for all states with the same purity). 

At this point, it is enough to prove that the upper bound vanishes in the large N limit. To this purpose, we use the expression for $q_{j,N}$ [Eq. (5) in the main text] and observe that one has

$$1 - e_N \ \ge\ \sum_{j \in \mathsf{S}_\epsilon} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 + j + 1\right) - \sum_{j \in \mathsf{S}_\epsilon} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 - j\right), \tag{15}$$


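where $j_0 = (2p-1)(N+1)/2$. The second summand on the r.h.s. of Eq. (15) is negligible in the large N limit: precisely, it can be bounded as

$$
\begin{aligned}
\sum_{j \in \mathsf{S}_\epsilon} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 - j\right)
&\le \sum_{j=0}^{N/2} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 - j\right) \\
&\le \frac{1}{2p-1} \sum_{j=0}^{N/2} B\!\left(N+1, p, \frac N2 - j\right) \\
&\le \frac{1}{2p-1} \exp\left[ -\frac{2(2p-1)^2 N^2}{N+1} \right],
\end{aligned}
\tag{16}
$$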


having used Hoeffding's inequality in the last step. Hence, this term goes to zero exponentially fast with N. Now, recall that we chose $\mathsf{S}_\epsilon$ to be the interval

$$\mathsf{S}_\epsilon = \left[\, j_0 - \tfrac12 - \sqrt{N \ln(2/\epsilon)},\ \ j_0 - \tfrac12 + \sqrt{N \ln(2/\epsilon)} \,\right]. \tag{17}$$
Setting $j_0 - j - 1/2 = x$, we then obtain

$$
\begin{aligned}
e_N &\le 1 - \sum_{x = -\sqrt{N\ln(2/\epsilon)}}^{\sqrt{N\ln(2/\epsilon)}} \left(1 - \frac{x}{j_0}\right) B\big(N+1, p, p(N+1) - x\big) + \frac{1}{2p-1} \exp\left[ -\frac{2(2p-1)^2 N^2}{N+1} \right] \\
&= 1 - \sum_{x = -\sqrt{N\ln(2/\epsilon)}}^{\sqrt{N\ln(2/\epsilon)}} B\big(N+1, p, p(N+1) - x\big) + \frac{1}{2p-1} \exp\left[ -\frac{2(2p-1)^2 N^2}{N+1} \right] \\
&\le 2 \exp\left[ \frac{2N}{N+1} \ln\frac{\epsilon}{2} \right] + \frac{1}{2p-1} \exp\left[ -\frac{2(2p-1)^2 N^2}{N+1} \right] \\
&\le \epsilon^{\frac{2N}{N+1}} + \frac{1}{2p-1} \exp\left[ -\frac{2(2p-1)^2 N^2}{N+1} \right].
\end{aligned}
$$

In the second-to-last step we have used Hoeffding's inequality. The second term on the right-hand side vanishes exponentially fast with N, while the first tends to $\epsilon^2$; hence we can always find an $N_0$ such that $e_N \le \epsilon^{3/2} < \epsilon$ for any $N \ge N_0$.
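The dimension of the encoded system is now

$$d_{\rm enc} = \sum_{j \in \mathsf{S}_\epsilon} (2j+1) = 2(2p-1) \sqrt{N \ln(2/\epsilon)}\,(N+1).$$

An upper bound on the number of required qubits is given by

$$
\begin{aligned}
\log d_{\rm enc} &= \log\left[ 2(2p-1)\, N \sqrt{N \ln\frac{2}{\epsilon}} \right] + \log\left(1 + \frac1N\right) \\
&\le \frac32 \log N + \log\left[ 2(2p-1) \sqrt{\ln\frac{2}{\epsilon}} \right] + 1 .
\end{aligned}
$$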


□ 
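The quantities appearing in this proof can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes $N$ even with integer $j$, uses the binomial expression $q_{j,N} = \frac{2j+1}{2j_0}\left[B(N+1,p,\tfrac N2+j+1) - B(N+1,p,\tfrac N2-j)\right]$, keeps the values of $j$ in the interval of Eq. (17), and evaluates the resulting error bound $\sum_{j\notin \mathsf S_\epsilon} q_{j,N}$ and the encoding dimension. All function names are hypothetical.

```python
from math import ceil, exp, floor, lgamma, log, sqrt


def binom_pmf(n: int, p: float, k: int) -> float:
    """B(n, p, k): probability of k successes in n trials (0 outside 0..n)."""
    if k < 0 or k > n:
        return 0.0
    # Work in log space so that large n does not overflow.
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def q(N: int, p: float, j: int) -> float:
    """q_{j,N} for even N and integer j, from the binomial expression."""
    j0 = (p - 0.5) * (N + 1)
    diff = binom_pmf(N + 1, p, N // 2 + j + 1) - binom_pmf(N + 1, p, N // 2 - j)
    return (2 * j + 1) / (2 * j0) * diff


def approximate_protocol(N: int, p: float, eps: float):
    """Return (error bound, d_enc) when only j in the interval (17) is kept."""
    j0 = (p - 0.5) * (N + 1)
    half_width = sqrt(N * log(2 / eps))
    lo = max(0, ceil(j0 - 0.5 - half_width))
    hi = min(N // 2, floor(j0 - 0.5 + half_width))
    kept = range(lo, hi + 1)
    error_bound = 1.0 - sum(q(N, p, j) for j in kept)  # weight discarded
    d_enc = sum(2 * j + 1 for j in kept)
    return error_bound, d_enc


err, d_enc = approximate_protocol(400, 0.8, 0.1)
d_zero_error = (400 // 2 + 1) ** 2  # sum of (2j+1) over all j = 0..N/2
```

In this example the discarded weight is far below the tolerance, while the encoding dimension is well below the zero-error value $(N/2+1)^2$.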


THE PURE STATE CASE: NO DISCONTINUOUS GAP BETWEEN ZERO-ERROR AND APPROXIMATE COMPRESSION

Here we prove that the type of discontinuity highlighted by our Theorems 1 and 2 is specific to mixed states. Consider the pure state ensemble $\{ (|\mathbf n\rangle\langle \mathbf n|)^{\otimes N}, d^2\mathbf n \}$, where $|\mathbf n\rangle$ is the pure qubit state with Bloch vector $\mathbf n$ and $d^2\mathbf n$ is the invariant measure on the Bloch sphere. Suppose that the state $(|\mathbf n\rangle\langle \mathbf n|)^{\otimes N}$ is encoded into a state $\rho_{\mathbf n,{\rm enc}}$ on a Hilbert space of dimension $d_{\rm enc}$. Assuming that the compression error is bounded by $\epsilon$, an argument by Horodecki [8] gives a lower bound on $d_{\rm enc}$. The argument is based on the following lemma, which in turn relies on the Alicki-Fannes inequality:

Lemma 1 ([36]). Let $\{\rho_x, p_x\}$ be an ensemble of states and let $\{\rho_{x,{\rm enc}}, p_x\}$ be the ensemble of the encoded states. If the compression protocol has error bounded by $\epsilon$, then the following inequality holds

$$\left| \chi(\{\rho_x, p_x\}) - \chi(\{\rho_{x,{\rm enc}}, p_x\}) \right| \le 2 \left[ \epsilon \log d_{\rm in} + \eta(\epsilon) \right], \tag{18}$$

where $d_{\rm in}$ is the rank of the average state $\rho = \sum_x p_x \rho_x$ and $\eta(x) = -x \ln x$.

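In our case, $d_{\rm in}$ is the dimension of the symmetric subspace, namely

$$d_{\rm in} = d_N = N + 1. \tag{19}$$

Moreover, we have

$$\chi\left(\{ (|\mathbf n\rangle\langle \mathbf n|)^{\otimes N}, d^2\mathbf n \}\right) = H\left( I_N / d_N \right) = \log(N+1), \tag{20}$$

with $I_N$ the identity on the symmetric subspace, and, by Holevo's bound [35],

$$\chi\left(\{ \rho_{\mathbf n,{\rm enc}}, d^2\mathbf n \}\right) \le \log d_{\rm enc}. \tag{21}$$

Recalling that $d_{\rm in} = N + 1$ and combining Eqs. (18), (19), (20), and (21), we obtain the bound

$$\log d_{\rm enc} \ge (1 - 2\epsilon) \log(N+1) - 2\eta(\epsilon).$$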


Now, note that the r.h.s. is continuous in $\epsilon$ and tends to $\log(N+1)$ when $\epsilon$ tends to zero. The value $\log(N+1)$ is
exactly the minimum number of qubits needed to encode a generic state in the symmetric subspace with zero error.
Hence, as e tends to zero, the number of qubits needed for approximate compression tends to the number of qubits 
needed for zero-error compression. 
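The continuity claim can be made concrete with a short numerical sketch. It evaluates the right-hand side $(1-2\epsilon)\log(N+1) - 2\eta(\epsilon)$ for decreasing $\epsilon$ and compares it with $\log(N+1)$. The choice of base-2 logarithm for the dimension and natural logarithm inside $\eta$ follows the formulas as written above and is otherwise an assumption; the function names are hypothetical.

```python
from math import log, log2


def eta(x: float) -> float:
    """eta(x) = -x ln x, extended by continuity to eta(0) = 0."""
    return 0.0 if x == 0 else -x * log(x)


def pure_state_lower_bound(N: int, eps: float) -> float:
    """(1 - 2 eps) log(N + 1) - 2 eta(eps), with the dimension log in base 2."""
    return (1 - 2 * eps) * log2(N + 1) - 2 * eta(eps)


N = 1000
zero_error = log2(N + 1)
# Gap between the zero-error cost and the approximate lower bound.
gaps = [zero_error - pure_state_lower_bound(N, eps)
        for eps in (1e-1, 1e-2, 1e-3, 1e-4)]
```

The gap shrinks monotonically as $\epsilon$ decreases, in contrast with the mixed-state case, where the exponent of $N$ jumps as soon as $\epsilon > 0$.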


PROOF OF THEOREM 3 

Here we prove the optimality of our protocol among all compression protocols where the encoding is covariant and 
the decoding preserves the magnitude of the total angular momentum. Precisely, we assume that 

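1. the encoding space $\mathcal{H}_{\rm enc}$ supports a unitary representation of the group SU(2), denoted by $\{V_g \mid g \in SU(2)\}$

2. the encoding channel satisfies the covariance condition

$$\mathcal{E} \circ \mathcal{U}_g = \mathcal{V}_g \circ \mathcal{E} \qquad \forall g \in SU(2), \tag{22}$$

where $\mathcal{U}_g$ and $\mathcal{V}_g$ are the unitary channels defined by $\mathcal{U}_g(\cdot) := U_g \cdot U_g^\dagger$ and $\mathcal{V}_g(\cdot) := V_g \cdot V_g^\dagger$.

3. the decoding channel $\mathcal{D}$ preserves the magnitude of the total angular momentum, in the sense that, for every input state $\rho$, one has

$$\operatorname{Tr}\left[ \mathbf{K}^2\, \mathcal{D}(\rho) \right] = \operatorname{Tr}\left[ \mathbf{J}^2 \rho \right], \tag{23}$$

where $\mathbf{K} = (K_x, K_y, K_z)$ are the generators of the representation $\{V_g, g \in SU(2)\}$ and $\mathbf{J} = (J_x, J_y, J_z)$ are the generators of the representation $\{U_g^{\otimes N}, g \in SU(2)\}$.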

Under these conditions, we can prove the optimality of the protocol presented in Theorem 3 of the main text. 

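Proof of Theorem 3. For the purpose of this proof, it is convenient to parametrize the mixed states $\rho_{\mathbf n}$ as $\rho_g = U_g \rho\, U_g^\dagger$, where $\rho$ is a fixed state and $g$ is a generic element of SU(2). Let us decompose the encoding space as

$$\mathcal{H}_{\rm enc} = \bigoplus_j \left( \mathcal{R}_j \otimes \mathcal{M}_j \right), \tag{24}$$

where $j$ is the quantum number of the angular momentum, $\mathcal{R}_j$ is the corresponding representation space, and $\mathcal{M}_j$ is a suitable multiplicity space. By definition, one has

$$\mathcal{H}_{\rm enc} \supseteq \operatorname{Span}\left\{ \operatorname{Supp}\left[ \mathcal{E}\left(\rho_g^{\otimes N}\right) \right], g \in SU(2) \right\} = \operatorname{Span}\left[ \operatorname{Supp}(\Omega) \right], \qquad \Omega := \int dg\, \mathcal{E}\left(\rho_g^{\otimes N}\right). \tag{25}$$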

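Since $\mathcal{E}$ is covariant, the state $\Omega$ satisfies the relation $V_g \Omega V_g^\dagger = \Omega$, $\forall g \in SU(2)$. Hence, $\Omega$ can be written in the block diagonal form

$$\Omega = \bigoplus_{j \in \mathsf{S}} c_j \left( \frac{I_j}{d_j} \otimes \omega_j \right),$$

where the $c_j > 0$ are probabilities, $\omega_j$ is a suitable state on the multiplicity space and $\mathsf{S}$ is a suitable set of values of the angular momentum number. Combining the above decomposition with Eq. (25), we obtain the bound

$$d_{\rm enc} \ge \operatorname{rank} \Omega \ge \sum_{j \in \mathsf{S}} d_j . \tag{26}$$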
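On the other hand, since the decoding preserves the magnitude of the angular momentum, one has

$$\operatorname{Tr}\left[ \Pi_j\, \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) \right] = \operatorname{Tr}\left[ \Pi'_j\, \mathcal{E}\left(\rho_g^{\otimes N}\right) \right], \qquad \forall j = 0, \dots, N/2, \ \forall g \in SU(2),$$

where $\Pi_j$ is the projector on the sector with angular momentum $j$ in the $N$-qubit space, while $\Pi'_j$ is the projector on $\mathcal{R}_j \otimes \mathcal{M}_j$ in the encoding space. Hence, we have

$$\sum_{j \in \mathsf{S}} \operatorname{Tr}\left[ \Pi_j\, \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) \right] = 1, \qquad \forall g \in SU(2), \tag{27}$$

meaning that all the output states $\mathcal{D}\circ\mathcal{E}(\rho_g^{\otimes N})$ are contained in the subspace $\mathcal{H}_N$, the direct sum of the sectors of the $N$-qubit space with $j \in \mathsf{S}$. Hence, we have

$$
\begin{aligned}
e_N &= \frac12 \left\| \rho_g^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) \right\|_1 \qquad \forall g \in SU(2) \\
&\ge \frac12 \left\| P_N \left[ \rho_g^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) \right] P_N \right\|_1 + \frac12 \left\| \left(I^{\otimes N} - P_N\right) \left[ \rho_g^{\otimes N} - \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) \right] \left(I^{\otimes N} - P_N\right) \right\|_1 \\
&\ge \frac12 \left\| \left(I^{\otimes N} - P_N\right) \rho_g^{\otimes N} \left(I^{\otimes N} - P_N\right) \right\|_1 ,
\end{aligned}
\tag{28}
$$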


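where $P_N$ is the projector on $\mathcal{H}_N$. Now we prove that any protocol with $d_{\rm enc} = O(N^{3/2-\delta})$, $\delta > 0$, will have a non-vanishing error. Recall from the main text that the probability distribution $q_{j,N}$ can be expressed as

$$q_{j,N} = \frac{2j+1}{2 j_0} \left[ B\!\left(N+1, p, \frac N2 + j + 1\right) - B\!\left(N+1, p, \frac N2 - j\right) \right], \tag{29}$$

where $B(n,p,k)$ is the binomial distribution with $n$ trials and with probability $p$ and

$$j_0 = (p - 1/2)(N+1).$$

Combining Eq. (28) with Eq. (29), we have

$$e_N \ge \frac12 - \frac12 \sum_{j \in \mathsf{S}} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 + j + 1\right).$$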


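We split the set $\mathsf{S}$ into two subsets $\mathsf{S}_1$ and $\mathsf{S}_2$, defined as

$$\mathsf{S}_1 = \mathsf{S} \cap \left[ j_0 - \frac{\sqrt{cN}+1}{2},\ j_0 + \frac{\sqrt{cN}+1}{2} \right], \qquad \mathsf{S}_2 = \mathsf{S} \setminus \mathsf{S}_1,$$

where $c$ is an arbitrary constant. The error is then bounded as

$$e_N \ge \frac12 \left( 1 - s_1 - s_2 \right), \qquad s_k := \sum_{j \in \mathsf{S}_k} \frac{2j+1}{2 j_0}\, B\!\left(N+1, p, \frac N2 + j + 1\right), \quad k = 1, 2. \tag{30}$$

We now bound $s_1$ and $s_2$. Let us start from $s_1$: by definition, we have

$$
\begin{aligned}
s_1 &\le \frac{\max_{j \in \mathsf{S}_1} (2j+1)}{2 j_0} \sum_{j \in \mathsf{S}_1} B\!\left(N+1, p, \frac N2 + j + 1\right) \\
&= O(1) \sum_{j \in \mathsf{S}_1} B\!\left(N+1, p, \frac N2 + j + 1\right) \\
&\le O(1)\, |\mathsf{S}_1|\, B\!\left(N+1, p, \frac N2 + j_0 + 1\right) \\
&= O\left(N^{-1/2}\right) |\mathsf{S}_1| .
\end{aligned}
\tag{31}
$$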





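In turn, $|\mathsf{S}_1|$ can be bounded from the relation

$$|\mathsf{S}_1| \left( \min_{j \in \mathsf{S}_1} (2j+1) \right) \le \sum_{j \in \mathsf{S}_1} (2j+1) \le d_{\rm enc} = O\left( N^{3/2-\delta} \right), \tag{32}$$

which implies $|\mathsf{S}_1| \le O(N^{1/2-\delta})$. Inserting this relation into Eq. (31), we finally obtain

$$s_1 \le O\left( N^{-\delta} \right). \tag{33}$$

Regarding $s_2$, we have the bound

$$
\begin{aligned}
s_2 &\le \frac{N+1}{2 j_0} \sum_{j \in \mathsf{S}_2} B\!\left(N+1, p, \frac N2 + j + 1\right) \\
&\le \frac{1}{2p-1} \sum_{j:\, |j - j_0| > \frac{\sqrt{cN}+1}{2}} B\!\left(N+1, p, \frac N2 + j + 1\right) \\
&\le \frac{e^{-c/2}}{p - 1/2},
\end{aligned}
\tag{34}
$$

the last inequality coming from Hoeffding's bound.

Finally, combining the inequalities (30), (33), and (34), we obtain the lower bound

$$e_N \ge \frac12 \left[ 1 - O\left(N^{-\delta}\right) - \frac{e^{-c/2}}{p - 1/2} \right].$$

Since the constant $c$ is arbitrary, the bound becomes $e_N \ge 1/2 - O(N^{-\delta})$.

□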
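The scaling behind this lower bound can be illustrated numerically with a simplified variant of the argument (not the $\mathsf S_1$/$\mathsf S_2$ split used above): for any set $\mathsf S$ with $\sum_{j\in\mathsf S}(2j+1) \le d_{\rm enc}$ one has $\sum_{j\in\mathsf S} q_{j,N} \le d_{\rm enc}\max_j q_{j,N}/(2j+1)$, hence $e_N \ge \frac12\left(1 - d_{\rm enc}\max_j \frac{q_{j,N}}{2j+1}\right)$. The sketch assumes $N$ even, $d_{\rm enc} = N^{3/2-\delta}$ with unit prefactor, and hypothetical function names.

```python
from math import exp, lgamma, log


def binom_pmf(n: int, p: float, k: int) -> float:
    """B(n, p, k), evaluated in log space to avoid overflow for large n."""
    if k < 0 or k > n:
        return 0.0
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def q(N: int, p: float, j: int) -> float:
    """q_{j,N} for even N and integer j, as in Eq. (29)."""
    j0 = (p - 0.5) * (N + 1)
    diff = binom_pmf(N + 1, p, N // 2 + j + 1) - binom_pmf(N + 1, p, N // 2 - j)
    return (2 * j + 1) / (2 * j0) * diff


def error_lower_bound(N: int, p: float, delta: float) -> float:
    """(1/2) * (1 - d_enc * max_j q_j / (2j+1)) with d_enc = N^(3/2 - delta)."""
    d_enc = N ** (1.5 - delta)
    best_ratio = max(q(N, p, j) / (2 * j + 1) for j in range(N // 2 + 1))
    return 0.5 * (1.0 - d_enc * best_ratio)


# The bound creeps towards 1/2 as N grows (slowly, like N^(-delta)).
bounds = [error_lower_bound(N, 0.8, 0.25) for N in (100, 400, 1600)]
```

Since $\max_j q_{j,N}/(2j+1)$ scales as $N^{-3/2}$, the product with $d_{\rm enc}$ decays as $N^{-\delta}$, which is the mechanism the proof exploits.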


UPPER BOUND ON THE COMPLEXITY OF GENERATING APPROXIMATE MAXIMALLY MIXED STATES


The decoding requires the preparation of maximally mixed states to be placed in the multiplicity register. For a given value of $j$, this is accomplished by generating a maximally entangled state of rank $m_j$. In the following we present a three-step protocol for this purpose.

1. Choose an integer $n = O(N)$ such that $m_j \in (2^{n-1}, 2^n]$. Prepare $n$ maximally entangled qubit pairs. The resulting state is $\rho = \left[ |\Phi^+\rangle\langle\Phi^+| \right]^{\otimes n}$, with $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$, and lies in a space of dimension $2^{2n}$.

2. Perform the measurement in the computational basis on one qubit of each entangled pair. The outcomes of the individual qubit measurements are saved in a sequence of $n$ binary digits, which we denote by $y$.

3. Compare the string $y$ with the binary expression of $m_j$. If $y$, as a number, is not smaller than $m_j$, the protocol fails and we have to restart by preparing again $n$ maximally entangled qubit pairs. Otherwise, we keep the remaining qubits, which, on average, will be in a maximally mixed state of rank $m_j$.

The last step can be seen by writing down the quantum operation $\mathcal{C}_{\rm yes}$ corresponding to the successful outcomes of the projective measurement, given by

$$\mathcal{C}_{\rm yes}(\sigma) = \sum_{y < m_j} |y\rangle\langle y|\, \sigma\, |y\rangle\langle y| .$$

The protocol is successful in more than half of the cases. For that reason, the probability of failure vanishes exponentially in the number of repetitions $l$ as $p_{\rm no} \le 2^{-l}$. To ensure that the error vanishes fast enough with the number of state copies $N$, we repeat the protocol $N$ times. The complexity of a single run comes from preparing the qubit states, which takes $O(N)$ steps, and from comparing the $n$-digit binary strings on a classical computer, which also takes $O(N)$ steps. By repeating the protocol $N$ times, the overall complexity is $O(N^2)$. It is safe to run the protocol $N$ times to ensure an exponentially vanishing error, because the complexity of the decoding is still dominated by the Schur transform.
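The classical side of the three steps above amounts to rejection sampling, which the following sketch simulates. It is only a toy model under stated assumptions: each measured half of an entangled pair is replaced by a fair random bit (the outcome statistics of that measurement), no quantum state is simulated, and the function name and the `max_rounds` cutoff are hypothetical.

```python
import random


def sample_multiplicity_index(m: int, rng: random.Random, max_rounds: int = 64):
    """Draw y uniformly from {0, ..., m-1} by repeating steps 1-3.

    Returns the accepted value of y and the number of rounds used.
    """
    n = (m - 1).bit_length()  # the n with 2**(n-1) < m <= 2**n
    for rounds in range(1, max_rounds + 1):
        y = 0
        for _ in range(n):
            # One computational-basis measurement on half of a Bell pair
            # gives a uniformly random bit.
            y = (y << 1) | rng.randrange(2)
        if y < m:  # success: y labels one of the m accepted outcomes
            return y, rounds
    raise RuntimeError("all rounds failed (probability below 2**-max_rounds)")


rng = random.Random(0)
draws = [sample_multiplicity_index(5, rng) for _ in range(2000)]
values = [y for y, _ in draws]
mean_rounds = sum(r for _, r in draws) / len(draws)
```

Because $m_j > 2^{n-1}$, each round succeeds with probability $m_j/2^n > 1/2$, so the expected number of rounds is below 2.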
ZERO-ERROR COMPRESSION FOR QUANTUM SYSTEMS OF DIMENSION d > 2

In this and the following sections, we generalize our results to quantum systems of arbitrary finite dimension $d < \infty$.


Upper bound on the number of encoding qubits 

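Theorem 4. In dimension $d$, every ensemble of $N$ identically prepared mixed states of rank $r$ can be encoded without error into less than $\frac{2dr - r^2 + r - 2}{2} \log(N + d - 1)$ qubits.

The proof is based on the Schur-Weyl duality, which allows one to decompose the $N$-copy Hilbert space as

$$\mathcal{H}^{\otimes N} \simeq \bigoplus_{\lambda \in \mathcal{Y}_{N,d}} \left( \mathcal{R}_\lambda \otimes \mathcal{M}_\lambda \right),$$

where $\mathcal{R}_\lambda$ is a representation space, $\mathcal{M}_\lambda$ is a multiplicity space, and the sum runs over the set $\mathcal{Y}_{N,d}$ of all Young diagrams of $N$ boxes arranged in $d$ rows, parametrized as $\lambda = (\lambda_1, \dots, \lambda_d)$, with $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$, $\sum_{i=1}^d \lambda_i = N$. We use the notations

$$d_\lambda = \dim \mathcal{R}_\lambda \qquad \text{and} \qquad m_\lambda = \dim \mathcal{M}_\lambda .$$

Relative to this decomposition, every state of the form $\rho^{\otimes N}$ where $\rho$ has rank $r$ can be cast into the form

$$\rho^{\otimes N} \simeq \bigoplus_{\lambda \in \mathcal{Y}_{N,r}} q_{\lambda,N} \left( \rho_\lambda \otimes \frac{I_{m_\lambda}}{m_\lambda} \right),$$

where $\rho_\lambda$ is a quantum state on $\mathcal{R}_\lambda$, $I_{m_\lambda}$ is the identity on $\mathcal{M}_\lambda$, and $q_{\lambda,N}$ is a suitable probability distribution. Note that only the Young diagrams with $r$ rows or less are present here (for this fact, see e.g. [37]).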

The proof of Theorem 4 makes use of the following lemmas: 

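Lemma 2. For every $\lambda \in \mathcal{Y}_{N,r}$ one has $d_\lambda \le (N + d - 1)^{(2dr - r^2 - r)/2}$.

Proof. The dimension can be expressed as

$$d_\lambda = \frac{\prod_{1 \le i < j \le d} (\lambda_i - \lambda_j - i + j)}{\prod_{k=1}^{d-1} k!},$$

cf. Eq. (III.10) of [38]. Since $\lambda_i = 0$ for $i > r$, we have the following chain of (in)equalities

$$
\begin{aligned}
d_\lambda &= \frac{\prod_{1 \le i < j \le r} (\lambda_i - \lambda_j - i + j) \cdot \prod_{1 \le i \le r < j \le d} (\lambda_i - i + j) \cdot \prod_{r < i < j \le d} (j - i)}{\prod_{k=1}^{d-1} k!} \\
&\le \frac{(N + r - 1)^{r(r-1)/2} \cdot (N + d - 1)^{r(d-r)} \cdot \prod_{k=1}^{d-r-1} k!}{\prod_{k=1}^{d-1} k!} \\
&\le \frac{(N + d - 1)^{(2dr - r^2 - r)/2}}{\prod_{k=d-r}^{d-1} k!} .
\end{aligned}
\tag{35}
$$

□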


Lemma 3. The total dimension of all the representation spaces corresponding to Young diagrams with no more than $r$ rows is upper bounded as

$$\sum_{\lambda \in \mathcal{Y}_{N,r}} d_\lambda \le (N + d - 1)^{\frac{2dr - r^2 + r - 2}{2}} .$$

Proof. By Lemma 2 one has

$$\sum_{\lambda \in \mathcal{Y}_{N,r}} d_\lambda \le (N + d - 1)^{\frac{2dr - r^2 - r}{2}}\, |\mathcal{Y}_{N,r}| \le (N + d - 1)^{\frac{2dr - r^2 + r - 2}{2}},$$

having used the bound $|\mathcal{Y}_{N,r}| \le \binom{N+r-1}{r-1}$ [39] and the elementary bound $\binom{N+r-1}{r-1} \le (N+1)^{r-1} \le (N+d-1)^{r-1}$. □

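Proof of Theorem 4. A zero-error compression protocol is given by the following encoding and decoding channels:

$$\mathcal{E}(\rho) = \bigoplus_{\lambda \in \mathcal{Y}_{N,r}} \operatorname{Tr}_{\mathcal{M}_\lambda} \left[ \Pi_\lambda\, \rho\, \Pi_\lambda \right], \qquad \mathcal{D}(\rho') = \bigoplus_{\lambda \in \mathcal{Y}_{N,r}} \left( P_\lambda\, \rho'\, P_\lambda \otimes \frac{I_{m_\lambda}}{m_\lambda} \right),$$

where $\Pi_\lambda$ is the projector on $\mathcal{R}_\lambda \otimes \mathcal{M}_\lambda$ and $P_\lambda$ is the projector on $\mathcal{R}_\lambda$. The encoding space is $\mathcal{H}_{\rm enc} = \bigoplus_{\lambda \in \mathcal{Y}_{N,r}} \mathcal{R}_\lambda$ and has dimension $d_{\rm enc} = \sum_{\lambda \in \mathcal{Y}_{N,r}} d_\lambda$, which we bound as

$$d_{\rm enc} = \sum_{\lambda \in \mathcal{Y}_{N,r}} d_\lambda \le (N + d - 1)^{\frac{2dr - r^2 + r - 2}{2}},$$

having used Lemma 3.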


□ 
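Lemmas 2 and 3 can be checked by brute force for small parameters. The sketch below enumerates the Young diagrams with $N$ boxes and at most $r$ rows, evaluates $d_\lambda$ with the product formula $d_\lambda = \prod_{i<j}(\lambda_i - \lambda_j - i + j)/\prod_{k=1}^{d-1} k!$ quoted in the proof of Lemma 2, and compares with the two bounds. Function names are hypothetical, and the enumeration is only practical for small $N$.

```python
from math import factorial, prod


def young_diagrams(N: int, rows: int, largest=None):
    """Yield all non-increasing tuples of `rows` non-negative parts summing to N."""
    if largest is None:
        largest = N
    if rows == 0:
        if N == 0:
            yield ()
        return
    for first in range(min(N, largest), -1, -1):
        for rest in young_diagrams(N - first, rows - 1, first):
            yield (first,) + rest


def irrep_dimension(lam, d: int) -> int:
    """d_lambda for SU(d); lam is padded with zeros up to d rows."""
    lam = tuple(lam) + (0,) * (d - len(lam))
    num = prod(lam[i] - lam[j] + j - i for i in range(d) for j in range(i + 1, d))
    den = prod(factorial(k) for k in range(1, d))
    return num // den  # the quotient is always an integer


def check_lemmas(N: int, d: int, r: int):
    """Return (max d_lambda, Lemma 2 bound, sum of d_lambda, Lemma 3 bound)."""
    dims = [irrep_dimension(lam, d) for lam in young_diagrams(N, r)]
    single_bound = (N + d - 1) ** ((2 * d * r - r * r - r) // 2)
    total_bound = (N + d - 1) ** ((2 * d * r - r * r + r - 2) // 2)
    return max(dims), single_bound, sum(dims), total_bound


qutrit_rank2 = check_lemmas(4, 3, 2)
qubit_full_rank = check_lemmas(4, 2, 2)
```

The exponents $(2dr - r^2 - r)/2$ and $(2dr - r^2 + r - 2)/2$ are always integers, which is why integer division is used. For $d = r = 2$ the total dimension reproduces $(N/2+1)^2$.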


Lower bound on the number of encoding qubits used by the zero-error protocol 


Here we give a lower bound on the dimension of the encoding space in the zero-error protocol described in the proof 
of Theorem 4. Precisely, we have the following 

Lemma 4. The total dimension of all the representation spaces corresponding to Young diagrams with no more than $r$ rows is lower bounded as

$$\sum_{\lambda \in \mathcal{Y}_{N,r}} d_\lambda \ge c(r,d)\, N^{\frac{2dr - r^2 + r - 2}{2}}, \tag{36}$$

where $c$ is a suitable function.

Proof. For simplicity, we use the notation $f(N,r,d) \gtrsim g(N,r,d)$ to mean that there exists a function $c(r,d)$ such that $f(N,r,d) \ge c(r,d)\, g(N,r,d)$ for every $N$. If $f(N,r,d) \gtrsim g(N,r,d)$ and $g(N,r,d) \gtrsim f(N,r,d)$, then we write $f(N,r,d) \approx g(N,r,d)$. With this notation, we have

dx> n 


having used Eq. (35). Consider the case when $N$ is a multiple of $r(r+1)/2$ and define $s = 2N/[r(r+1)]$. Define the
subset of Young diagrams

Score = |A € 3dv,r | A G [(f — i + l)s — —, (r — i + l)s + — , 

For every diagram in S cor e we have the lower bound 


Vi = 1,..., i— 1 


dx> 


> 


( A * A j) 


l<i<r— 1 


n [o - *) s - - 

l<i<j<r 


n ( A * - X r) 

1 

n I- 


n a 

1 <i<r<j<d 



n Ar 


r<.j<.d 


l<i<r—l 


r - %)S -2l 


n ( r -*)4 { n d 


l<i<r<j<d 


r<j<.d 


2 dr-r^-r 

S 2 

2 dr-r 2 -r 

N -$- 


(37) 
Now, the total dimension of the subspaces with Young diagrams in $\mathsf{S}_{\rm core}$ can be lower bounded as

$$
\begin{aligned}
\sum_{\lambda \in \mathsf{S}_{\rm core}} d_\lambda &\gtrsim N^{\frac{2dr - r^2 - r}{2}}\, |\mathsf{S}_{\rm core}| \\
&= N^{\frac{2dr - r^2 - r}{2}} \left( \frac{s}{2} \right)^{r-1} \\
&\approx N^{\frac{2dr - r^2 - r}{2}}\, N^{r-1} \\
&= N^{\frac{2rd - r^2 + r - 2}{2}} .
\end{aligned}
$$

Since $\mathsf{S}_{\rm core}$ is a subset of $\mathcal{Y}_{N,r}$, we obtain Eq. (36). □

Following the steps adopted in the d = 2 case, it is also possible to show that the lower bound of Lemma 4 is actually a lower bound for every zero-error protocol that works for a complete ensemble of mixed states, i.e. for an ensemble of the form $\{\rho_g^{\otimes N}, p_g\}$ where the state $\rho_g$ is non-degenerate and the probability distribution $p_g$ is dense on SU(d). Essentially, the argument is based on the use of Proposition 3, which can be applied here to all the SU(2) subgroups of SU(d).


APPROXIMATE COMPRESSION FOR QUANTUM SYSTEMS OF DIMENSION d > 2 

Compression protocol 

Here we consider ensembles of $N$ identically prepared mixed states, each of them having the same spectrum. Every such ensemble can be written in the form $\{\rho_g^{\otimes N}, p_g\}$, where $\rho_g$ is a density matrix of the form

$$\rho_g = U_g\, \rho_0\, U_g^\dagger, \qquad g \in SU(d),$$

$\rho_0$ is a rank-$r$ density matrix with non-degenerate positive eigenvalues, and $p_g$ is a probability distribution over the group SU(d). For ensembles of this form, we have the following

Theorem 5. For every $\epsilon > 0$ there exists an integer $N_0$ such that for every $N \ge N_0$ the ensemble $\{\rho_g^{\otimes N}, p_g\}$ can be compressed with error less than $\epsilon$ into $N_{\rm enc}$ qubits, with

$$N_{\rm enc} \le \frac{2dr - r^2 - 1 - m}{2} \log(N + d - 1) + \frac{m + r - 1}{2} \log\left[ 4d(d+1) \ln(N+1) + 8 \ln\left(\frac{1}{\epsilon}\right) + O\left(\frac{1}{\sqrt{N}}\right) \right]$$

and $m := \sum_{i=1}^{r} \mu_i$, where $\mu_i$ is the cardinality of the set $\{j : j > i,\ p_j = p_i\}$. We notice that $m = 0$ when the spectrum is non-degenerate.


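The proof of the theorem is based on the Schur-Weyl decomposition

$$\rho_g^{\otimes N} = \bigoplus_{\lambda \in \mathcal{Y}_{N,r}} q_{\lambda,N} \left( U_g^{(\lambda)}\, \rho_{0,\lambda}\, U_g^{(\lambda)\dagger} \otimes \frac{I_{m_\lambda}}{m_\lambda} \right), \tag{38}$$

where $\rho_{0,\lambda}$ is a fixed density matrix on $\mathcal{R}_\lambda$ and $U_g^{(\lambda)}$ is the irreducible representation of SU(d) acting on $\mathcal{R}_\lambda$. The key point is that the probability distribution $q_{\lambda,N}$ is concentrated on the Young diagrams such that the vector

$$p_\lambda := \left( \frac{\lambda_1}{N}, \dots, \frac{\lambda_d}{N} \right) \tag{39}$$

is close to the vector of the eigenvalues of $\rho_0$ [40, 41], listed as

$$p = (p_1, \dots, p_d), \qquad p_1 \ge p_2 \ge \dots \ge p_r > p_{r+1} = \dots = p_d = 0 . \tag{40}$$

Precisely, we will use the following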
Lemma 5 ([40, 41]). Let $p_\lambda$ and $p$ be the vectors defined in Eqs. (39) and (40), respectively, and let $d(a,b) := \frac12 \sum_i |a_i - b_i|$ be the total variation distance between two vectors. Then, one has

$$\operatorname{Prob}\left[ \lambda : d(p_\lambda, p) > x \right] \le (N+1)^{d(d+1)/2}\, e^{-2Nx^2},$$

with $\operatorname{Prob}[\lambda : d(p_\lambda, p) > x] := \sum_{\lambda:\, d(p_\lambda, p) > x} q_{\lambda,N}$, $q_{\lambda,N}$ being the probability distribution in Eq. (38).

The idea of the proof is to discard all Young diagrams whose probability vector $p_\lambda$ falls outside a ball of size $O(1/\sqrt{N})$ around the vector $p$. The dimensions of the subspaces associated to the remaining diagrams can be bounded with the following

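Lemma 6. The maximum dimension of a subspace $\mathcal{R}_\lambda$ satisfying $d(p_\lambda, p) \le x$ is upper bounded as

$$d_\lambda \le (4Nx + r)^m\, (N + d - 1)^{\frac{2dr - r(r+1)}{2} - m} . \tag{41}$$

Proof. The dimension can be bounded as

$$
\begin{aligned}
d_\lambda &= \frac{\prod_{1 \le i < j \le d} (\lambda_i - \lambda_j - i + j)}{\prod_{k=1}^{d-1} k!} \\
&\le \prod_{1 \le i \le r}\ \prod_{i < j \le d} (\lambda_i - \lambda_j - i + j) \\
&\le \prod_{1 \le i \le r} \left[ \prod_{i < j \le i + \mu_i} (4Nx + r) \prod_{i + \mu_i < j \le d} (N + d - 1) \right] \\
&= \prod_{1 \le i \le r} (4Nx + r)^{\mu_i}\, (N + d - 1)^{d - i - \mu_i} \\
&= (4Nx + r)^m\, (N + d - 1)^{\frac{2dr - r(r+1)}{2} - m},
\end{aligned}
$$

where the first inequality follows because the factors with $i > r$ equal $\prod_{k=1}^{d-r-1} k! \le \prod_{k=1}^{d-1} k!$, and the second uses the fact that the ball $\mathsf{S} = \{\lambda \in \mathcal{Y}_{N,r} : d(p_\lambda, p) \le x\}$ is contained in the hypercube $\mathsf{S}' = \{\lambda \in \mathcal{Y}_{N,r} : |\lambda_i/N - p_i| \le 2x,\ \forall i = 1, \dots, r-1\}$, so that, for $p_i = p_j$, $i < j$, one has $\lambda_i - \lambda_j \le 4Nx$. □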

Lemma 7. The total dimension of the subspaces satisfying $d(p_\lambda, p) \le x$ satisfies

$$\sum_{\lambda \in \mathcal{Y}_{N,r}:\ d(p_\lambda, p) \le x} d_\lambda \le (N + d - 1)^{\frac{2dr - r(r+1)}{2} - m}\, (4Nx + r)^{m + r - 1} .$$


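Proof. Immediate from Lemma 6 and from the fact that the ball $\mathsf{S} = \{\lambda \in \mathcal{Y}_{N,r} : d(p_\lambda, p) \le x\}$ is contained in the hypercube $\mathsf{S}' = \{\lambda \in \mathcal{Y}_{N,r} : |\lambda_i/N - p_i| \le 2x,\ \forall i = 1, \dots, r-1\}$, yielding the bound

$$|\mathsf{S}| \le |\mathsf{S}'| \le (4Nx)^{r-1} .$$

□

Proof of Theorem 5. To compress within an error $\epsilon$, we choose the encoding and decoding channels

$$
\begin{aligned}
\mathcal{E}(\rho) &= \bigoplus_{\lambda \in \mathsf{S}_\epsilon} \operatorname{Tr}_{\mathcal{M}_\lambda} \left[ \Pi_\lambda\, \rho\, \Pi_\lambda \right] + \operatorname{Tr}\left[ \rho \left( I^{\otimes N} - \Pi_\epsilon \right) \right] \rho_{\rm fail} \\
\mathcal{D}(\rho') &= \bigoplus_{\lambda \in \mathsf{S}_\epsilon} \left( P_\lambda\, \rho'\, P_\lambda \otimes \frac{I_{m_\lambda}}{m_\lambda} \right),
\end{aligned}
$$

with $\Pi_\epsilon = \bigoplus_{\lambda \in \mathsf{S}_\epsilon} \Pi_\lambda$, $\operatorname{Supp}(\rho_{\rm fail}) \subseteq \mathcal{H}_{\rm enc} = \bigoplus_{\lambda \in \mathsf{S}_\epsilon} \mathcal{R}_\lambda$ and

$$\mathsf{S}_\epsilon := \left\{ \lambda \in \mathcal{Y}_{N,r} \mid d(p_\lambda, p) \le x_\epsilon \right\}, \qquad x_\epsilon := \sqrt{ \frac{ \frac{d(d+1)}{2} \ln(N+1) + \ln(1/\epsilon) }{2N} } .$$

The value of $x_\epsilon$ is chosen in order to bound the compression error as

$$
\begin{aligned}
e_N &= \frac12 \left\| \mathcal{D}\circ\mathcal{E}\left(\rho_g^{\otimes N}\right) - \rho_g^{\otimes N} \right\|_1 \qquad \forall g \in SU(d) \\
&\le \frac12 \operatorname{Tr}\left[ \rho_g^{\otimes N} \left( I^{\otimes N} - \Pi_\epsilon \right) \right] \left\| \mathcal{D}(\rho_{\rm fail}) - \rho_{g,{\rm fail}} \right\|_1 \\
&\le \operatorname{Tr}\left[ \rho_g^{\otimes N} \left( I^{\otimes N} - \Pi_\epsilon \right) \right] \\
&= \sum_{\lambda \notin \mathsf{S}_\epsilon} q_{\lambda,N} \\
&\le (N+1)^{d(d+1)/2}\, e^{-2N x_\epsilon^2} \\
&= \epsilon,
\end{aligned}
$$

where

$$\rho_{g,{\rm fail}} := \bigoplus_{\lambda \notin \mathsf{S}_\epsilon} \frac{q_{\lambda,N}}{\operatorname{Tr}\left[ \rho_g^{\otimes N} \left( I^{\otimes N} - \Pi_\epsilon \right) \right]} \left( U_g^{(\lambda)}\, \rho_{0,\lambda}\, U_g^{(\lambda)\dagger} \otimes \frac{I_{m_\lambda}}{m_\lambda} \right),$$

the last inequality coming from Lemma 5. On the other hand, the encoding subspace has dimension

$$
\begin{aligned}
d_{\rm enc} &= \sum_{\lambda \in \mathsf{S}_\epsilon} d_\lambda \\
&\le (N + d - 1)^{\frac{2dr - r(r+1)}{2} - m}\, (4N x_\epsilon + r)^{m + r - 1} \\
&\le (N + d - 1)^{\frac{2dr - r(r+1)}{2} - m}\, N^{\frac{m + r - 1}{2}} \left[ 4d(d+1) \ln(N+1) + 8 \ln\left(\frac{1}{\epsilon}\right) + O\left(\frac{1}{\sqrt{N}}\right) \right]^{\frac{m + r - 1}{2}} \\
&\le (N + d - 1)^{\frac{2dr - r^2 - 1 - m}{2}} \left[ 4d(d+1) \ln(N+1) + 8 \ln\left(\frac{1}{\epsilon}\right) + O\left(\frac{1}{\sqrt{N}}\right) \right]^{\frac{m + r - 1}{2}},
\end{aligned}
$$

having used Lemma 7 and the definition of $x_\epsilon$. Hence, the number of encoding qubits satisfies

$$N_{\rm enc} \le \log d_{\rm enc} \le \frac{2rd - r^2 - 1 - m}{2} \log(N + d - 1) + \frac{m + r - 1}{2} \log\left[ 4d(d+1) \ln(N+1) + 8 \ln\left(\frac{1}{\epsilon}\right) + O\left(\frac{1}{\sqrt{N}}\right) \right] .$$

□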
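The choice of $x_\epsilon$ and the resulting qubit count can be checked with a few lines of arithmetic. The sketch below verifies that $x_\epsilon$ makes the tail bound of Lemma 5 equal to $\epsilon$, that $(4Nx_\epsilon)^2 = N\left[4d(d+1)\ln(N+1) + 8\ln(1/\epsilon)\right]$, and evaluates the leading terms of the Theorem 5 bound. Dropping the $O(1/\sqrt N)$ correction and using base-2 logarithms for the qubit count are assumptions of this illustration; the function names are hypothetical.

```python
from math import exp, log, log2, sqrt


def x_eps(N: int, d: int, eps: float) -> float:
    """Radius x_eps of the ball of retained Young diagrams."""
    return sqrt((d * (d + 1) / 2 * log(N + 1) + log(1 / eps)) / (2 * N))


def tail_bound(N: int, d: int, x: float) -> float:
    """Right-hand side of Lemma 5: (N+1)^{d(d+1)/2} exp(-2 N x^2)."""
    return (N + 1) ** (d * (d + 1) / 2) * exp(-2 * N * x * x)


def leading_qubits(N: int, d: int, r: int, m: int, eps: float) -> float:
    """Leading terms of the Theorem 5 bound, without the O(1/sqrt(N)) term."""
    a = 4 * d * (d + 1) * log(N + 1) + 8 * log(1 / eps)
    return ((2 * d * r - r * r - 1 - m) / 2 * log2(N + d - 1)
            + (m + r - 1) / 2 * log2(a))


N, d, eps = 1000, 3, 0.01
x = x_eps(N, d, eps)
a = 4 * d * (d + 1) * log(N + 1) + 8 * log(1 / eps)
qubit_case = leading_qubits(10 ** 6, 2, 2, 0, 0.01)  # d = r = 2, m = 0
```

For $d = r = 2$ and $m = 0$ the leading coefficient is $3/2$, matching the qubit result of Theorem 2.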


Optimality proof in the presence of symmetry 


Here we prove the converse of Theorem 5. Our proof is valid for protocols where the encoding is covariant and the decoding preserves the nonabelian charges [42] identified by the Young diagrams. Precisely, we assume that

1. the encoding space $\mathcal{H}_{\rm enc}$ supports a unitary representation of the group SU(d), denoted by $\{V_g \mid g \in SU(d)\}$.

2. the encoding channel satisfies the covariance condition $\mathcal{E} \circ \mathcal{U}_g = \mathcal{V}_g \circ \mathcal{E}$, $\forall g \in SU(d)$.

3. the decoding channel $\mathcal{D}$ preserves the nonabelian charges associated to SU(d), namely, for every input state $\rho$, one has

$$\operatorname{Tr}\left[ \Pi_\lambda\, \mathcal{D}(\rho) \right] = \operatorname{Tr}\left[ \tilde\Pi_\lambda\, \rho \right], \qquad \forall \lambda \in \mathcal{Y}_{N,d}, \tag{42}$$

where $\tilde\Pi_\lambda$ is the projector on the direct sum of all the invariant subspaces of $\mathcal{H}_{\rm enc}$ with Young diagram $\lambda$.

By the same argument as in the qubit case, the error of a compression protocol satisfying the above assumptions can be lower bounded as $e_N \ge (1/2) \sum_{\lambda \notin \mathsf{S}} q_{\lambda,N}$, with $\mathsf{S}$ being a subset of $\mathcal{Y}_{N,r}$ specified by the protocol. The encoding dimension is given by $d_{\rm enc} = \sum_{\lambda \in \mathsf{S}} d_\lambda$. We have the following theorem.

Theorem 6. Every compression protocol that encodes a complete $N$-qubit ensemble into

$$\left( \frac{2dr - r^2 - 1 - m}{2} - \delta \right) \log N, \qquad \delta > 0,$$

qubits with covariant encoding and a decoding that preserves the nonabelian charges will necessarily have error $e \ge 1/2$ in the asymptotic limit. Here $m := \sum_{i=1}^{r} \mu_i$, where $\mu_i$ is the cardinality of the set $\{j : j > i,\ p_j = p_i\}$. We notice that $m = 0$ when the spectrum is non-degenerate.
To prove the theorem, we first define the cubic lattice

$$\mathsf{H}_\epsilon := \left\{ \lambda \in \mathcal{Y}_{N,r} \ \Big|\ \lambda_i \in \left[ p_i N - \frac{\sqrt{N^{1+\epsilon}}}{2},\ p_i N + \frac{\sqrt{N^{1+\epsilon}}}{2} \right] \ \forall i = 1, \dots, r-1 \right\} \tag{43}$$

for any constant $\epsilon \in (0,1)$. With this definition, the sum of the probabilities $q_{\lambda,N}$ over $\lambda \notin \mathsf{H}_\epsilon$ vanishes exponentially in $N$. Precisely, we have the following lemma.

Lemma 8. For the set $\mathsf{H}_\epsilon$ defined by Eq. (43), the following bound holds.

$$\sum_{\lambda \notin \mathsf{H}_\epsilon} q_{\lambda,N} \le (N+1)^{\frac{d(d+1)}{2}}\, e^{-\frac{N^\epsilon}{8}} .$$

Proof. For any Young diagram $\lambda$ not in the set $\mathsf{H}_\epsilon$, there exists at least one $i$ such that $|\lambda_i - p_i N| > \sqrt{N^{1+\epsilon}}/2$. Thus we have

$$d(p_\lambda, p) \ge \frac12 \left| \frac{\lambda_i}{N} - p_i \right| > \frac{1}{4\sqrt{N^{1-\epsilon}}} .$$

Substituting this fact into Lemma 5, we immediately get the lemma.

□


Now we start to bound the probability distribution $q_{\lambda,N}$ within the set $\mathsf{H}_\epsilon$. Notice that the exact expression of $q_{\lambda,N}$ is given as [37]

$$q_{\lambda,N} = \frac{\det A}{\det \Sigma}\, m_\lambda, \tag{44}$$

where the matrix $\Sigma$ is independent of $N$ (and thus its expression is not relevant to bounding the probability) and the matrix $A$ is a rank-$r$ square matrix defined as follows:

$$A_{ij} = \left[ \prod_{\beta=0}^{\mu_j - 1} (\lambda_i + r - i - \beta) \right] p_j^{\lambda_i + r - i - \mu_j}, \tag{45}$$

with $\mu_i$ defined in Theorem 6. Notice that we follow the convention $\prod_{i=0}^{-1} f(i) = 1$. We first prove the following bound on $\det A$.

Lemma 9. For any $\lambda$ in the set $\mathsf{H}_\epsilon$ defined by Eq. (43), the following bound holds asymptotically for large $N$:

$$\det A \le N^{\frac{(1+\epsilon) m}{2}} \left( \prod_{i=1}^{r} p_i^{\lambda_i + r - i - \mu_i} \right), \qquad m = \sum_{i=1}^{r} \mu_i .$$


Proof. Suppose that there are $k$ distinct positive values in the spectrum, and the $i$-th biggest value has degeneracy $r_i$. We can then divide the set $\{1, \dots, r\}$ into $k$ subsets $\mathsf{L}_1 \cup \dots \cup \mathsf{L}_k$, corresponding to the distinct eigenvalues, so that $\mathsf{L}_i$ is the set of indices corresponding to the $i$-th biggest eigenvalue. Recalling that $r_j$ is the degeneracy of the $j$-th eigenvalue, we have

$$\mathsf{L}_i = \left\{ \sum_{l=1}^{i-1} r_l + 1,\ \dots,\ \sum_{l=1}^{i} r_l \right\} .$$

Notice that, by definition, one has

$$p_l = p_k \qquad \forall l, k \in \mathsf{L}_i .$$

With the above definition, the spectrum now reads

$$\underbrace{p_1 = \dots = p_{r_1}}_{\mathsf{L}_1} > \underbrace{p_{r_1 + 1} = \dots = p_{r_1 + r_2}}_{\mathsf{L}_2} > \dots > \underbrace{p_{\sum_{i=1}^{k-1} r_i + 1} = \dots = p_r}_{\mathsf{L}_k} > p_{r+1} = \dots = p_d = 0 . \tag{46}$$
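Correspondingly, we define a subgroup $\mathsf{P}_r$ of the group $S_r$, consisting of the products of permutations that act within the subsets $\{\mathsf{L}_i\}$. Precisely,

$$\mathsf{P}_r := \left\{ \sigma^{(1)} \times \sigma^{(2)} \times \dots \times \sigma^{(k)} \ \Big|\ \sigma^{(i)} \in S_{r_i},\ i = 1, \dots, k \right\} .$$

With the above definition, we divide $\det A$ into two terms

$$\det A = t_1 + t_2, \qquad t_1 = \sum_{\sigma \in \mathsf{P}_r} \operatorname{sgn}(\sigma) \left( \prod_{i=1}^{r} A_{i \sigma_i} \right), \qquad t_2 = \sum_{\sigma \notin \mathsf{P}_r} \operatorname{sgn}(\sigma) \left( \prod_{i=1}^{r} A_{i \sigma_i} \right), \tag{47}$$

denoting by $\sigma_i$ the index that comes from applying $\sigma$ to $i$.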

Let us bound $t_1$. By definition, $\mathsf{P}_r$ contains every permutation $\sigma$ such that $p_i = p_{\sigma_i}$ for every $i$. Therefore, we have


M<T.' 1 


h = Y sgn ( a ) \ 

a£ P r \ 

is 

(A i+r-i-P) 

. 0=0 

= Y sgn ( cr ) \ 

a£ P r 

[n 

M® - j 1 

(A i+r-i-P) 

. 0 =° 

= Y sgn ( cr ) \ 

ere P r \ 

It? 

M< Ti~l 

(A i+r-i-P) 

. 18=0 

/ r 

A 

1 ” r Mcr^ — 1 


A iA a 

P*i 


Pi 


A 


Pi 


A 


IL 


n> 


-Mi 


il 




Y sgn ^ c 


creP r 


II IT (\+r-i~P) 


i=1 3=0 


Since i and cq are always in the same subset L; (for suitable l), we can rewrite the term n[=i 11 / 3=0 1 (A./ + r — i — p) 

as nti n, e L, + r-i-p). We then have 


*i= n pY r - 1 -^) y {n 




^=i 


aePr 

k 


1=1 


n e 


M cr^ 1 

n n (\i+r-i-p) 

i6L, /3=0 

n 11 (A* + r - i - P) 


l—l | a( l ) £S r 
k 


n E sgn(>>) 

;=i ^(Oes,., 


iSL ( /3=0 

n < A <), 

i£L, 


r (0 


Ih 

Ai=l 


A i-\-r—i—fii 


j [ det A i j . 


\i =l 


Here A/ is a rank r; square matrix defined as 


n-j-i 

( A i)ij= LI (A i+r-i-P), 


/3=0 


observing that $\mu_j$ assumes the values $r_l - 1, r_l - 2, \dots, 1, 0$ for the indices in $\mathsf{L}_l$. The determinant of $A_l$ equals
$\prod_{i<j;\ i,j \in \mathsf{L}_l} (\lambda_i - \lambda_j - i + j)$. Combining this with the definition of $\mathsf{H}_\epsilon$ (43), we have


to 




1 [ ]3[ (A» — A j + j — i) 

l—l l<i<j<ri 


n 


A Hi 

Pi 


ri(^) 

.1=1 


r i( r i - 1 ) 


Up 

Ki=1 


A*+i— 


(l + e>n 

= TV 2 


IL 




(l + e)r, 

TV 2 


The last step follows from the fact that 


IL 

\i=l 


r k ri 

= E^ = EE+ _ + 

*=i »=i j=i 


Next, we bound the second term t 2 in Eq. (47) as 


*2< e (u^ 

a<£P r \i=l 


= e n 

cr£P r I i=l 


Mo - j 1 

II (A*+r-i-j) 
j=o 


Xi+r—i—Ha 
P*i 


^ E 

<T^P r 


jQ(TV + r — l) p<7i pi 


Ai+r—z— 


i=l 


= ( jv + r-ir E 


<T^P r U=1 


n 


Aj+r —£—Ha 
Pai 


= (N + r - l) m 53 


<T0P r U=1 


n 


= (JV + r-ir E 


crgPr U=1 


n 


TV 

Pi 

PV 
Pi 


A«+i— 


JVpi+0(\/JV 1 + e 


n* 


Aj+r—j — n a 


j =1 


iu 


Aj+r—j— 


(TV + r - l) m 53 ex P [-ACD(p| kp)] 

<T^P r 


np 

i=i 


3=1 

Xi+r—i—Ha 


(48) 


where $D(p\|q) := \sum_i p_i \ln(p_i/q_i)$ is the Kullback-Leibler divergence and $\sigma p := (p_{\sigma_1}, \dots, p_{\sigma_r})$. Now, since $\sigma \notin \mathsf{P}_r$, we always have $D(p\|\sigma p) > 0$. Therefore, the second term in Eq. (47) vanishes exponentially in $N$. Combining this fact with Eq. (47) and Eq. (48) we get the desired bound on $\det A$. □

Lemma 10. For any $\lambda$ in the set $\mathsf{H}_\epsilon$ defined by Eq. (43), the following bound holds asymptotically for large $N$:

$$\frac{q_{\lambda,N}}{d_\lambda} \le N^{-\frac{2dr - r^2 - 1 - (1+\epsilon) m}{2}} .$$


Proof. The dimension of $\mathcal{M}_\lambda$ is given by

$$m_\lambda = \frac{N!}{\prod_{i=1}^{d} (\lambda_i + d - i)!} \prod_{1 \le i < j \le d} (\lambda_i - \lambda_j + j - i)$$

(see e.g. [37]) and can be bounded as

rn\ < 


< N 


1 (N 

\d— 1 \d— 2 \d—r \ \ 
A-^ A2 • • • Ar \ / ' 

N 


]^[ (A* — A j + j — ' 

l<i<j<d 


J^[ (Ai - A j + j — i) 

l<i<j<d 


for any A € H e . Substituting the above bound and the bound in Lemma 9 into Eq. (44), we have 

' N A 


Q\,N 


< 


N 2 


clet S 


n 


\ \ 2 dr — r' 1 

it A-f 


]^[ (A, — Aj + j — i) 

l<i<j<d 


2dr — r^ —r (1 + e) 

< AT 2 m 


(iV,p,A) [] (Aj — Xj + j — i) 

l<i<j<d 


< N' 


2dr—r^ — 1 — (l-|-e)n 


(A* — A j + j — i) 

l<i<j<d 


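which holds for any $\lambda \in \mathsf{H}_\epsilon$. The last inequality comes from the upper bound on the multinomial $m(N, p, \lambda)$. Finally, we get the desired bound on $q_{\lambda,N}/d_\lambda$ by combining the above bound with the expression of $d_\lambda$

$$d_\lambda = \frac{\prod_{1 \le i < j \le d} (\lambda_i - \lambda_j - i + j)}{\prod_{k=1}^{d-1} k!} .$$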


□ 
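The two dimension formulas used in this proof can be cross-checked against each other: by Schur-Weyl duality, $\sum_\lambda d_\lambda m_\lambda = d^N$. The sketch below evaluates $m_\lambda = N! \prod_{i<j}(\lambda_i - \lambda_j + j - i) / \prod_i (\lambda_i + d - i)!$ and $d_\lambda = \prod_{i<j}(\lambda_i - \lambda_j - i + j)/\prod_{k=1}^{d-1} k!$ by brute force for small $N$ and $d$. Function names are hypothetical.

```python
from math import factorial, prod


def young_diagrams(N: int, rows: int, largest=None):
    """Yield all non-increasing tuples of `rows` non-negative parts summing to N."""
    if largest is None:
        largest = N
    if rows == 0:
        if N == 0:
            yield ()
        return
    for first in range(min(N, largest), -1, -1):
        for rest in young_diagrams(N - first, rows - 1, first):
            yield (first,) + rest


def pair_product(lam) -> int:
    """prod_{i<j} (lam_i - lam_j + j - i), shared by both formulas."""
    d = len(lam)
    return prod(lam[i] - lam[j] + j - i for i in range(d) for j in range(i + 1, d))


def irrep_dimension(lam) -> int:
    """d_lambda: dimension of the SU(d) representation space, d = len(lam)."""
    return pair_product(lam) // prod(factorial(k) for k in range(1, len(lam)))


def multiplicity(lam) -> int:
    """m_lambda: dimension of the multiplicity space for N = sum(lam) copies."""
    d = len(lam)
    # With 0-based i, the factor (lam_i + d - i)! becomes (lam[i] + d - i - 1)!.
    den = prod(factorial(lam[i] + d - i - 1) for i in range(d))
    return factorial(sum(lam)) * pair_product(lam) // den


def total_dimension(N: int, d: int) -> int:
    """sum over lambda of d_lambda * m_lambda; should equal d**N."""
    return sum(irrep_dimension(l) * multiplicity(l) for l in young_diagrams(N, d))
```

For instance, for $N = 4$ and $d = 3$ the diagrams $(4,0,0)$, $(3,1,0)$, $(2,2,0)$, $(2,1,1)$ contribute $15\cdot 1 + 15\cdot 3 + 6\cdot 2 + 3\cdot 3 = 81 = 3^4$.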


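Finally, we can bound the error of any compression protocol with an encoding set $\mathsf{S}$ and with the encoding dimension $d_{\rm enc} = O\left( N^{\frac{2dr - r^2 - 1 - m}{2} - \delta} \right)$ as

$$
\begin{aligned}
e_N &\ge \frac12 \sum_{\lambda \notin \mathsf{S}} q_{\lambda,N} \\
&\ge \frac12 \left[ 1 - \sum_{\lambda \notin \mathsf{H}_{\delta/m}} q_{\lambda,N} - \sum_{\lambda \in \mathsf{H}_{\delta/m} \cap \mathsf{S}} q_{\lambda,N} \right] \\
&\ge \frac12 \left[ 1 - \sum_{\lambda \notin \mathsf{H}_{\delta/m}} q_{\lambda,N} - \max_{\lambda \in \mathsf{H}_{\delta/m}} \frac{q_{\lambda,N}}{d_\lambda} \sum_{\lambda \in \mathsf{S}} d_\lambda \right] \\
&\ge \frac12 \left[ 1 - (N+1)^{\frac{d(d+1)}{2}}\, e^{-\frac{1}{8} N^{\delta/m}} - O\left( N^{-\delta/2} \right) \right] .
\end{aligned}
$$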





